Bioinformatics Machine Learning Platform

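Searching for the best model across many machine learning algorithms is now a common approach in high-impact bioinformatics papers. For example, the recent Nature Communications (IF: 14.919) article "Machine learning-based integration develops an immune-derived lncRNA signature for improving outcomes in colorectal cancer" evaluated 101 machine learning prediction models and found that the best one was a combination of Lasso and stepwise Cox regression. This combined model achieved a high C-index in every validation dataset.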
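
For readers unfamiliar with the metric, the C-index (concordance index) measures how often a survival model ranks two patients in the right order. The sketch below is a minimal pure-Python version of Harrell's C-index, not the code used in the cited paper. The function name `concordance_index` and the toy data are illustrative assumptions, and pairs with tied event times are skipped for simplicity.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs that the model orders correctly.

    times       -- observed follow-up time per subject
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event got the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the third one is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risk_scores)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a model with a consistently high C-index across validation cohorts is preferred.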

Searching for the Optimal Model

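BioMatrix, a multi-model machine learning system for bioinformatics developed by 螺旋矩阵 (literally "Spiral Matrix"), runs more than a hundred models in a single pass and automatically applies Bayesian hyperparameter optimization to search for the best solution. Researchers interested in using the system are welcome to get in touch. Some of the supported algorithms are listed below: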

  • Linear Models
    • Ordinary Least Squares
    • Ridge regression and classification
    • Lasso
    • Multi-task Lasso
    • Elastic-Net
    • Multi-task Elastic-Net
    • Least Angle Regression
    • LARS Lasso
    • Orthogonal Matching Pursuit (OMP)
    • Bayesian Regression
    • Logistic regression
    • Generalized Linear Regression
    • Stochastic Gradient Descent – SGD
    • Perceptron
    • Passive Aggressive Algorithms
    • Robustness regression: outliers and modeling errors
    • Quantile Regression
    • Polynomial regression: extending linear models with basis functions
  • Linear and Quadratic Discriminant Analysis
    • Dimensionality reduction using Linear Discriminant Analysis
    • Mathematical formulation of the LDA and QDA classifiers
    • Mathematical formulation of LDA dimensionality reduction
    • Shrinkage and Covariance Estimator
    • Estimation algorithms
  • Kernel ridge regression
  • Support Vector Machines
    • Classification
    • Regression
    • Density estimation, novelty detection
    • Complexity
    • Tips on Practical Use
    • Kernel functions
    • Mathematical formulation
    • Implementation details
  • Stochastic Gradient Descent
    • Classification
    • Regression
    • Online One-Class SVM
    • Stochastic Gradient Descent for sparse data
    • Complexity
    • Stopping criterion
    • Tips on Practical Use
    • Mathematical formulation
    • Implementation details
  • Nearest Neighbors
    • Unsupervised Nearest Neighbors
    • Nearest Neighbors Classification
    • Nearest Neighbors Regression
    • Nearest Neighbor Algorithms
    • Nearest Centroid Classifier
    • Nearest Neighbors Transformer
    • Neighborhood Components Analysis
  • Gaussian Processes
    • Gaussian Process Regression (GPR)
    • GPR examples
    • Gaussian Process Classification (GPC)
    • GPC examples
    • Kernels for Gaussian Processes
  • Cross decomposition
    • PLSCanonical
    • PLSSVD
    • PLSRegression
    • Canonical Correlation Analysis
  • Naive Bayes
    • Gaussian Naive Bayes
    • Multinomial Naive Bayes
    • Complement Naive Bayes
    • Bernoulli Naive Bayes
    • Categorical Naive Bayes
    • Out-of-core naive Bayes model fitting
  • Decision Trees
    • Classification
    • Regression
    • Multi-output problems
    • Complexity
    • Tips on practical use
    • Tree algorithms: ID3, C4.5, C5.0 and CART
    • Mathematical formulation
    • Minimal Cost-Complexity Pruning
  • Ensemble methods
    • Bagging meta-estimator
    • Forests of randomized trees
    • AdaBoost
    • Gradient Tree Boosting
    • Histogram-Based Gradient Boosting
    • Voting Classifier
    • Voting Regressor
    • Stacked generalization
  • Multiclass and multioutput algorithms
    • Multiclass classification
    • Multilabel classification
    • Multiclass-multioutput classification
    • Multioutput regression
  • Feature selection
  • Semi-supervised learning
    • Self Training
    • Label Propagation
  • Isotonic regression
  • Probability calibration
    • Calibration curves
    • Calibrating a classifier
  • Neural network models
    • Multi-layer Perceptron
    • Classification
    • Regression
    • Regularization
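
The multi-model search described above can be sketched with scikit-learn, which provides the algorithm families in this list. This is a small illustration under stated assumptions, not BioMatrix's actual interface: the function `search_best_model`, the three candidate models, their parameter grids, and the synthetic dataset are all hypothetical. scikit-learn has no built-in Bayesian optimizer, so `RandomizedSearchCV` is swapped in here as a plainly different tuning strategy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier


def search_best_model(X, y, n_iter=5, cv=3, random_state=0):
    """Tune several candidate models and return the best one by CV score."""
    # Hypothetical candidates: (estimator, hyperparameter search space).
    candidates = {
        "logistic_regression": (
            LogisticRegression(max_iter=1000),
            {"C": [0.01, 0.1, 1.0, 10.0, 100.0]},
        ),
        "k_nearest_neighbors": (
            KNeighborsClassifier(),
            {"n_neighbors": [3, 5, 7, 9, 11]},
        ),
        "random_forest": (
            RandomForestClassifier(random_state=random_state),
            {"n_estimators": [50, 100], "max_depth": [3, 5, None]},
        ),
    }

    results = {}
    for name, (estimator, param_space) in candidates.items():
        # Randomized search stands in for Bayesian optimization here.
        search = RandomizedSearchCV(
            estimator,
            param_space,
            n_iter=n_iter,
            cv=cv,
            scoring="roc_auc",
            random_state=random_state,
        )
        search.fit(X, y)
        results[name] = (search.best_score_, search.best_params_)

    # Pick the candidate with the highest mean cross-validated ROC AUC.
    best_name = max(results, key=lambda key: results[key][0])
    return best_name, results


# Synthetic data as a placeholder for a real expression matrix and labels.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)
best_name, results = search_best_model(X, y)
for name, (score, params) in results.items():
    print(f"{name}: ROC AUC = {score:.3f}, params = {params}")
print("best model:", best_name)
```

Scaling this pattern to a hundred models mainly means extending the `candidates` dictionary. For survival outcomes, the scoring metric would be replaced by a C-index computed on held-out cohorts.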