Improved swarm-optimization-based filter-wrapper gene selection from microarray data for gene expression tumor classification

Cited by: 30
Authors
Ke, Lin [1]
Li, Min [1]
Wang, Lei [1]
Deng, Shaobo [1]
Ye, Jun [1]
Yu, Xiang [1]
Affiliations
[1] Nanchang Institute of Technology, School of Information Engineering, Nanchang, Jiangxi, People's Republic of China
Funding
National Science Foundation (United States);
Keywords
Filter-wrapper gene selection; Microarray expression data; Swarm-based optimization; Population initialization; Genetic algorithm; Ant colony optimization; Rough set theory; Algorithm; Reduction; GA
DOI
10.1007/s10044-022-01117-9
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
A typical microarray dataset contains thousands of genes but only a small number of samples, and most of those genes are not relevant for classification. Identifying highly discriminating genes, known as biomarkers, is a challenging task for machine learning-based tumor classification. This study focuses on swarm-optimization-based filter-wrapper gene selection. In general, this type of hybrid gene selection consists of two steps: the filter step selects a small top-n percentage of genes and yields reduced data; the wrapper step then searches the remaining genes for the optimal gene subset by using a swarm-optimization-based algorithm. However, existing swarm-optimization-based filter-wrapper methods search the remaining genes in the second step without using the ranking information of those genes. This study attempts to fill that gap. Population initialization based on ranking criteria (PIRC) is proposed to modify the population initialization of the genetic algorithm (GA) and ant colony optimization (ACO); the resulting algorithms are called PIRCGA and PIRCACO, respectively. Experiments were carried out on 17 microarray expression datasets, comparing two pairs of methods: IG-GA vs. IG-PIRCGA and IG-ACO vs. IG-PIRCACO. The experimental results demonstrate the efficiency of the proposed methods.
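The two-step idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the PIRC formula, so everything below is an assumption made for illustration only: "IG" is read as information gain, each gene is binarized with a median split before scoring, and the hypothetical function `rank_biased_population` sets each bit with a probability that falls linearly from `p_best` to `p_worst` as the gene's rank worsens. It is not the authors' PIRCGA or PIRCACO.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one gene after a median split (the split is an assumption)."""
    threshold = sorted(values)[len(values) // 2]
    high = [y for v, y in zip(values, labels) if v >= threshold]
    low = [y for v, y in zip(values, labels) if v < threshold]
    remainder = sum(len(part) / len(labels) * entropy(part) for part in (high, low) if part)
    return entropy(labels) - remainder


def filter_top_percent(X, y, percent):
    """Filter step: indices of the top `percent` of genes, best-ranked first."""
    n_genes = len(X[0])
    scores = [information_gain([row[g] for row in X], y) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    keep = max(1, round(n_genes * percent / 100))
    return ranked[:keep]


def rank_biased_population(n_kept, pop_size, p_best=0.9, p_worst=0.1, rng=None):
    """Hypothetical ranking-aware initialization (not the paper's PIRC formula).

    Bit j of every individual is set to 1 with a probability that decreases
    linearly from p_best (rank 0) to p_worst (last rank).
    """
    rng = rng or random.Random()
    span = max(1, n_kept - 1)
    probs = [p_best - (p_best - p_worst) * j / span for j in range(n_kept)]
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)]


# Toy data: 4 samples x 4 genes; gene 0 separates the two classes perfectly.
X = [[5.1, 0.2, 3.3, 1.0],
     [4.9, 0.1, 3.1, 9.0],
     [1.2, 0.3, 3.2, 1.1],
     [1.0, 0.2, 3.0, 9.2]]
y = [1, 1, 0, 0]

kept = filter_top_percent(X, y, percent=50)   # filter step
population = rank_biased_population(len(kept), pop_size=6, rng=random.Random(0))
```

Each individual in `population` is a bit mask over `kept`; a wrapper search such as a GA would then evolve these masks, scoring each one by the accuracy of a classifier trained on the selected genes. A conventional initialization would instead set every bit with the same probability, which is the behavior the abstract describes as ignoring the ranking information.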
Pages: 455-472
Number of pages: 18