Model selection based on particle swarm optimization for omics data classification

Cited by: 1
Authors
Xu, Zhao [1]
Yang, Junshan [2]
Affiliations
[1] Dalian Univ Sci & Technol, Sch Elect Engn, Dalian, Peoples R China
[2] Dalian Univ Foreign Languages, Sch Software, Dalian, Peoples R China
Source
2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020) | 2020
Keywords
omics dataset; binary particle swarm optimization; data sampling; feature selection; classification; model selection
DOI
10.1109/ICMCCE51767.2020.00293
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A new model selection algorithm based on binary particle swarm optimization is proposed for omics data classification. In particular, the algorithm is designed to handle the high dimensionality, small sample size, and class imbalance that are inherent in omics data. The particles encode candidate combinations of data sampling, feature selection, and classification models, together with their corresponding parameter settings. The binary swarm optimization targets the best classification performance. The particle velocities and positions are updated iteratively until the stopping iteration is reached, and the optimal model combination is then output. Simulation results on eight real-world omics datasets show that the proposed model selection algorithm can avoid the bias introduced by manual settings and yield accurate and reliable classification performance.
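The iterative velocity and position update mentioned in the abstract can be illustrated with a minimal sketch of a generic binary particle swarm optimizer. This is not the authors' implementation: it assumes the commonly used sigmoid transfer rule for binary positions, and the function names (`binary_pso`, `toy_fitness`), the parameter values, and the toy objective are all hypothetical. In a setting like the one described, each bit string would presumably be decoded into a choice of sampling method, feature selector, classifier, and parameter setting, and the fitness would be a classification performance estimate; that decoding is not specified in this record and is left out here.

```python
import math
import random


def binary_pso(fitness, n_bits, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximize `fitness` over bit lists of length `n_bits` (illustrative helper)."""
    rng = random.Random(seed)
    # Random initial positions (bit strings) and velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_bits)] for _ in range(n_particles)]
    # Personal bests and the global best seen so far.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):  # stop after a fixed number of iterations
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp to avoid saturation
                vel[i][d] = v
                # Position update: sigmoid of velocity is the probability of a 1.
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def toy_fitness(bits):
    """Placeholder objective: number of bits matching a fixed target pattern."""
    target = [1, 0, 1, 1, 0, 0, 1, 0]
    return sum(int(b == t) for b, t in zip(bits, target))


best_bits, best_fit = binary_pso(toy_fitness, n_bits=8)
```

Replacing `toy_fitness` with a function that decodes the bits into a model pipeline and returns its validation score would be the natural extension, under the assumptions stated above.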
Pages: 1334-1337
Number of pages: 4
Related papers
8 items in total
[1]   Selection bias in gene extraction on the basis of microarray gene-expression data [J].
Ambroise, C ;
McLachlan, GJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (10) :6562-6566
[2]   A Critical Assessment of Feature Selection Methods for Biomarker Discovery in Clinical Proteomics [J].
Christin, Christin ;
Hoefsloot, Huub C. J. ;
Smilde, Age K. ;
Hoekman, B. ;
Suits, Frank ;
Bischoff, Rainer ;
Horvatovich, Peter .
MOLECULAR & CELLULAR PROTEOMICS, 2013, 12 (01) :263-276
[3]   Escalante HJ, 2009, J MACH LEARN RES, V10, P405
[4]   Predicting cancer phenotypes with mechanism-driven multi-omics data integration [J].
Marchionni, Luigi ;
Geman, Donald .
CANCER RESEARCH, 2015, 75
[5]   Momma M, 2002, SIAM PROC S, P261
[6]   A machine learning heuristic to identify biologically relevant and minimal biomarker panels from omics data [J].
Swan, Anna L. ;
Stekel, Dov J. ;
Hodgman, Charlie ;
Allaway, David ;
Algahtani, Mohammed H. ;
Mobasheri, Ali ;
Bacardit, Jaume .
BMC GENOMICS, 2015, 16
[7]   ROSEFW-RF: The winner algorithm for the ECBDL'14 big data competition: An extremely imbalanced big data bioinformatics problem [J].
Triguero, Isaac ;
del Rio, Sara ;
Lopez, Victoria ;
Bacardit, Jaume ;
Benitez, Jose M. ;
Herrera, Francisco .
KNOWLEDGE-BASED SYSTEMS, 2015, 87 :69-79
[8]   Independent Principal Component Analysis for biologically meaningful dimension reduction of large biological data sets [J].
Yao, Fangzhou ;
Coquery, Jeff ;
Le Cao, Kim-Anh .
BMC BIOINFORMATICS, 2012, 13