Development of a two-stage gene selection method that incorporates a novel hybrid approach using the cuckoo optimization algorithm and harmony search for cancer classification

Cited by: 60
Authors
Elyasigomari, V. [1 ]
Lee, D. A. [1 ]
Screen, H. R. C. [1 ]
Shaheed, M. H. [1 ]
Affiliations
[1] Queen Mary Univ London, Sch Engn & Mat Sci, London E1 4NS, England
Keywords
Gene selection; Minimum redundancy and maximum relevance (MRMR); Evolutionary algorithms; Cuckoo optimization algorithm; Harmony search algorithm; COA-HS; B-CELL LYMPHOMA; RECEPTOR TYROSINE KINASE; ACUTE MYELOID-LEUKEMIA; ALTERED EXPRESSION; MICROARRAY DATA; PARTICLE SWARM; PROSTATE; MACHINE; RON; OVEREXPRESSION;
DOI
10.1016/j.jbi.2017.01.016
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
For each cancer type, only a few genes are informative. Because of the so-called 'curse of dimensionality', gene selection remains a challenging task. To address this problem, we propose a two-stage gene selection method called MRMR-COA-HS. In the first stage, minimum redundancy and maximum relevance (MRMR) feature selection is used to select a subset of relevant genes. In the second stage, the selected genes are fed into a wrapper setup that combines a new algorithm, COA-HS, with a support vector machine as the classifier. The method was applied to four microarray datasets, and its performance was assessed by leave-one-out cross-validation. A comparison with other evolutionary algorithms suggested that the proposed algorithm significantly outperforms them, selecting fewer genes while maintaining the highest classification accuracy. The functions of the selected genes were further investigated, and it was confirmed that the selected genes are biologically relevant to each cancer type. (C) 2017 Published by Elsevier Inc.
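The two-stage structure described in the abstract (a filter that ranks genes, followed by a wrapper that scores gene subsets by leave-one-out cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation, and it makes several loud assumptions: absolute Pearson correlation stands in for the MRMR relevance and redundancy measures, a nearest-centroid classifier stands in for the support vector machine, a plain random subset search stands in for COA-HS, and the data are synthetic. All function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_filter(X, y, k):
    """Stage 1 (assumed proxy): greedily pick k genes maximising
    relevance to the label minus mean redundancy with genes already picked."""
    n_genes = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_genes)]
    relevance = [abs(pearson(c, y)) for c in cols]
    selected = []
    while len(selected) < min(k, n_genes):
        best, best_score = None, -math.inf
        for j in range(n_genes):
            if j in selected:
                continue
            redundancy = 0.0
            if selected:
                redundancy = sum(abs(pearson(cols[j], cols[s])) for s in selected) / len(selected)
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


def loocv_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            if rows:
                centroids[label] = [sum(row[g] for row in rows) / len(rows) for g in genes]
        point = [X[i][g] for g in genes]
        pred = min(centroids, key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
        correct += pred == y[i]
    return correct / len(X)


def wrapper_search(X, y, candidates, iterations=50, seed=0):
    """Stage 2 (stand-in for COA-HS): random subset search that prefers
    higher LOOCV accuracy first and fewer genes second."""
    rng = random.Random(seed)
    best_genes = list(candidates)
    best_key = (loocv_accuracy(X, y, best_genes), -len(best_genes))
    for _ in range(iterations):
        subset = rng.sample(candidates, rng.randint(1, len(candidates)))
        key = (loocv_accuracy(X, y, subset), -len(subset))
        if key > best_key:
            best_genes, best_key = subset, key
    return sorted(best_genes), best_key[0]


# Synthetic data: 12 samples, 6 "genes"; only gene 0 tracks the class label.
rng = random.Random(42)
y = [0] * 6 + [1] * 6
X = [[5.0 * label + rng.gauss(0, 0.1)] + [rng.gauss(0, 1) for _ in range(5)] for label in y]

candidates = mrmr_filter(X, y, k=3)            # stage 1: filter
genes, acc = wrapper_search(X, y, candidates)  # stage 2: wrapper
print("filtered genes:", candidates)
print("selected genes:", genes, "LOOCV accuracy:", acc)
```

The wrapper compares subsets by the pair (accuracy, negative subset size), which mirrors the stated goal of selecting fewer genes while keeping classification accuracy high. In a closer reproduction, the classifier and search strategy would be replaced by an SVM and the COA-HS optimiser described in the paper.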
Pages: 11-20
Page count: 10