A novel ensemble approach for multicategory classification of DNA microarray data using biological relevant gene sets

Cited by: 7
Authors
Reboiro-Jato, Miguel [1]
Glez-Pena, Daniel [1]
Diaz, Fernando [2]
Fdez-Riverola, Florentino [1]
Affiliations
[1] Univ Vigo, Dept Informat, Escuela Super Ingn Informat, Orense 32004, Spain
[2] Univ Valladolid, Dept Informat, Escuela Univ Informat, Segovia 40005, Spain
Keywords
microarray data; multicategory classification; ensemble classifiers; gene sets; knowledge integration; kappa statistic; PREDICTION; KNOWLEDGE; SELECTION
DOI
10.1504/IJDMB.2012.050267
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
An important emerging medical application domain for microarray technology is clinical decision support in the form of disease diagnosis. For this task, several computational methods, ranging from statistical alternatives to more complex hybrid systems, have been proposed in the literature. In this work we study the use of several ensemble alternatives for classifying microarray data by exploiting prior knowledge known to be biologically relevant to the target disease. Experimental results on different datasets and several gene sets show that the proposal is able to outperform previous approaches by introducing diversity in the form of different gene sets.
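The idea summarised in the abstract, an ensemble whose diversity comes from training each member on a different gene set, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which base classifiers or combination rules the paper uses, so a nearest-centroid base learner and a plurality vote are swapped in as assumptions. The function names, the gene-set index lists, and the toy expression values are all hypothetical and invented purely for illustration. Cohen's kappa is included because the keywords list the kappa statistic as an evaluation measure.

```python
from collections import Counter

def fit_centroids(X, y, genes):
    """Per-class mean expression, restricted to the genes of one gene set."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(genes))
        for k, g in enumerate(genes):
            acc[k] += row[g]
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def predict_one(centroids, row, genes):
    """Nearest centroid by squared Euclidean distance on the gene set."""
    def dist(c):
        return sum((row[g] - m) ** 2 for g, m in zip(genes, centroids[c]))
    return min(sorted(centroids), key=dist)

def ensemble_predict(X_train, y_train, X_test, gene_sets):
    """One base classifier per gene set, combined by plurality vote."""
    models = [(genes, fit_centroids(X_train, y_train, genes)) for genes in gene_sets]
    preds = []
    for row in X_test:
        votes = Counter(predict_one(c, row, genes) for genes, c in models)
        best = max(votes.values())
        # Assumption: ties are broken by the smallest label, for determinism.
        preds.append(min(label for label, v in votes.items() if v == best))
    return preds

def cohen_kappa(y_true, y_pred):
    """Agreement between true and predicted labels, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_freq, pred_freq = Counter(y_true), Counter(y_pred)
    expected = sum(true_freq[c] * pred_freq[c] for c in true_freq) / (n * n)
    return 1.0 if expected == 1.0 else (observed - expected) / (1.0 - expected)

# Toy data (invented): 6 samples, 4 genes, 3 classes.
X_train = [[5, 5, 1, 1], [6, 4, 1, 2],
           [1, 1, 5, 5], [2, 1, 6, 4],
           [5, 1, 5, 1], [6, 2, 4, 1]]
y_train = ["A", "A", "B", "B", "C", "C"]
# Hypothetical gene sets, given as column indices into the expression matrix.
gene_sets = [[0, 1], [2, 3], [0, 2]]
X_test = [[5, 5, 1, 1], [1, 1, 5, 5], [5, 1, 5, 1]]
y_test = ["A", "B", "C"]

predictions = ensemble_predict(X_train, y_train, X_test, gene_sets)
kappa = cohen_kappa(y_test, predictions)
```

In this sketch each ensemble member sees only the columns named by its gene set, so members can disagree even though they share the same training samples. A real study would replace the toy matrix with measured expression data and the index lists with curated, disease-relevant gene sets.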
Pages: 602-616
Number of pages: 15