The ability to classify patients based on gene-expression data varies by algorithm and performance metric

Cited: 7
Authors
Piccolo, Stephen [1 ]
Mecham, Avery [1 ]
Golightly, Nathan [1 ]
Johnson, Jeremie L. [1 ]
Miller, Dustin [1 ]
Affiliations
[1] Brigham Young Univ, Dept Biol, Provo, UT 84602 USA
Keywords
DISTANT RECURRENCE; PAM50; RISK; BIG DATA; CLASSIFICATION; CANCER; SELECTION; SCORE; MEDICINE
DOI
10.1371/journal.pcbi.1009926
Chinese Library Classification (CLC)
Q5 [Biochemistry]
Subject Classification Codes
071010; 081704
Abstract
By classifying patients into subgroups, clinicians can provide more effective care than using a uniform approach for all patients. Such subgroups might include patients with a particular disease subtype, patients with a good (or poor) prognosis, or patients most (or least) likely to respond to a particular therapy. Transcriptomic measurements reflect the downstream effects of genomic and epigenomic variations. However, high-throughput technologies generate thousands of measurements per patient, and complex dependencies exist among genes, so it may be infeasible to classify patients using traditional statistical models. Machine-learning classification algorithms can help with this problem. However, hundreds of classification algorithms exist, and most support diverse hyperparameters, so it is difficult for researchers to know which are optimal for gene-expression biomarkers. We performed a benchmark comparison, applying 52 classification algorithms to 50 gene-expression datasets (143 class variables). We evaluated algorithms that represent diverse machine-learning methodologies and have been implemented in general-purpose, open-source machine-learning libraries. When available, we combined clinical predictors with gene-expression data. Additionally, we evaluated the effects of performing hyperparameter optimization and feature selection using nested cross-validation. Kernel- and ensemble-based algorithms consistently outperformed other types of classification algorithms; however, even the top-performing algorithms performed poorly in some cases. Hyperparameter optimization and feature selection typically improved predictive performance, and univariate feature-selection algorithms typically outperformed more sophisticated methods. Together, our findings illustrate that algorithm performance varies considerably when other factors are held constant and thus that algorithm selection is a critical step in biomarker studies.
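The nested cross-validation design described in the abstract (univariate feature selection plus hyperparameter tuning in an inner loop, performance estimation in an outer loop) can be sketched in a few lines. The following is a minimal illustration, not the authors' benchmark code; it assumes scikit-learn, substitutes a synthetic matrix for real gene-expression data, and picks one representative kernel-based classifier (an RBF SVM) with ANOVA F-test feature selection and AUROC as the metric.

# Minimal sketch (not the authors' pipeline): nested cross-validation with
# univariate feature selection and hyperparameter tuning, using scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for a gene-expression matrix: many features, few samples.
X, y = make_classification(n_samples=100, n_features=1000, n_informative=20,
                           random_state=0)

# Inner loop: select the top-k features by ANOVA F-statistic, then tune an
# RBF-kernel SVM; both steps are fit only on inner-loop training folds.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", SVC()),
])
param_grid = {
    "select__k": [10, 50, 100],
    "clf__C": [0.1, 1, 10],
    "clf__gamma": ["scale", 0.001],
}
inner = GridSearchCV(pipeline, param_grid, cv=3, scoring="roc_auc")

# Outer loop: estimate generalization performance of the whole tuned pipeline,
# so feature selection and tuning never see the held-out test folds.
outer_scores = cross_val_score(inner, X, y, cv=5, scoring="roc_auc")
print(f"Nested-CV AUROC: {outer_scores.mean():.3f} +/- {outer_scores.std():.3f}")

Wrapping the GridSearchCV object inside cross_val_score is what makes the validation "nested": every preprocessing and tuning decision is re-made within each outer training fold, avoiding the optimistic bias that arises when features are selected on the full dataset.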
Pages: 34