genEnsemble: A new model for the combination of classifiers and integration of biological knowledge applied to genomic data

Cited by: 6
Authors
Reboiro-Jato, Miguel [1]
Laza, Rosalia [1]
Lopez-Fernandez, Hugo [1]
Glez-Pena, Daniel [1]
Diaz, Fernando [2]
Fdez-Riverola, Florentino [1]
Affiliations
[1] Escuela Super Ingn Informat, Orense 32004, Spain
[2] Univ Valladolid, Escuela Univ Informat, Segovia 40005, Spain
Keywords
Ensemble approaches; Microarray data classification; Knowledge integration; Inter-dataset robustness; INCORPORATING PRIOR KNOWLEDGE; SUPPORT VECTOR MACHINE; GENE-EXPRESSION DATA; CLASSIFICATION ACCURACY; MICROARRAY; ALGORITHMS
DOI
10.1016/j.eswa.2012.07.003
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, microarray technology has become widely used in biomedical areas such as drug target identification, pharmacogenomics and clinical research. However, developing valuable translational microarray-based diagnostic tools requires (i) a solid understanding of the relative strengths and weaknesses of the underlying classification methods and (ii) models whose behaviour is plausible and understandable from a biological point of view. In this paper we propose a novel classifier that combines the advantages of ensemble approaches with the benefits of integrating biological knowledge into the classification of microarray samples. The aim of this work is to guarantee the robustness of the proposed classification model when it is applied to several microarray datasets in an inter-dataset scenario. Comparative experimental results show that our proposal, working with biological knowledge, outperforms other well-known simple classifiers and ensemble alternatives in binary and multiclass cancer prediction problems using publicly available data. © 2012 Elsevier Ltd. All rights reserved.
Pages: 52-63
Page count: 12