A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics

Cited by: 41
Authors
Whalen, Sean [1]
Pandey, Gaurav [1]
Affiliations
[1] Icahn Sch Med Mt Sinai, Icahn Inst Genom & Multiscale Biol, Dept Genet & Genom Sci, New York, NY 10029 USA
Source
2013 IEEE 13th International Conference on Data Mining (ICDM) | 2013
Keywords
Bioinformatics; Genomics; Supervised learning; Ensemble methods; Stacking; Ensemble selection; Diversity; Fusion; Models
DOI
10.1109/ICDM.2013.21
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case studies in genomics, namely the prediction of genetic interactions and protein functions, to demonstrate their efficacy on real-world datasets and draw useful conclusions about their behavior. These methods include simple aggregation, meta-learning, cluster-based meta-learning, and ensemble selection using heterogeneous classifiers trained on resampled data to improve the diversity of their predictions. We present a detailed analysis of these methods across 4 genomics datasets and find the best of these methods offer statistically significant improvements over the state of the art in their respective domains. In addition, we establish a novel connection between ensemble selection and meta-learning, demonstrating how both of these disparate methods establish a balance between ensemble diversity and performance.
Pages: 807-816
Page count: 10