A Hybrid Ensemble Algorithm Combining AdaBoost and Genetic Algorithm for Cancer Classification with Gene Expression Data

Cited by: 5
Authors
Lu, Huijuan [1]
Gao, Huiyun [1]
Ye, Minchao [1]
Yan, Ke [1]
Wang, Xiuhui [1]
Affiliations
[1] China Jiliang Univ, Coll Informat Engn, Hangzhou, Zhejiang, Peoples R China
Source
2018 NINTH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY IN MEDICINE AND EDUCATION (ITME 2018) | 2018
Funding
National Natural Science Foundation of China;
Keywords
AdaBoost; Decision Group; K-Nearest Neighbor; Naive Bayes; Decision Tree; Genetic Algorithm;
DOI
10.1109/ITME.2018.00015
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Ensemble learning raises two key issues: (1) the diversity of the base classifiers and (2) the way multiple classifiers are integrated. This paper designs a special classifier structure, the decision group, to increase the diversity of the base classifier pool, and uses a genetic algorithm (GA) to assign a weight to each base classifier, improving classification performance by avoiding local extrema. Overall, the work presents an ensemble classification algorithm based on AdaBoost in which each base classifier is a decision group composed of three member classifiers: K-nearest neighbor (KNN), naive Bayes (NB), and decision tree (C4.5). This simple design targets the high dimensionality and small sample sizes typical of cancer gene expression data. Experimental results show that the proposed algorithm outperforms existing ensemble learning methods such as Bagging, Random Forest (RF), Rotation Forest (RoF), AdaBoost, AdaBoost-BPNN, AdaBoost-SVM, and AdaBoost-RF, and that it performs especially well on small-sample and unbalanced gene expression data.
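The two ideas named in the abstract, a decision group that votes among its members and GA-assigned weights for combining classifiers, can be illustrated with a minimal pure-Python sketch. The abstract does not specify the GA encoding, the fitness function, or how the GA is coupled to AdaBoost, so everything below is an assumption made for illustration: predictions are taken as precomputed label lists, fitness is plain accuracy, and the function names (`group_vote`, `weighted_vote`, `ga_weights`) are hypothetical. The AdaBoost sample-reweighting loop is omitted.

```python
import random
from collections import Counter


def group_vote(member_preds):
    """Majority vote of one decision group's members, per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*member_preds)]


def weighted_vote(group_preds, weights):
    """Combine the outputs of several decision groups by weighted voting."""
    combined = []
    for votes in zip(*group_preds):
        score = {}
        for label, w in zip(votes, weights):
            score[label] = score.get(label, 0.0) + w
        # sorted() makes tie-breaking deterministic (smallest label wins)
        combined.append(max(sorted(score), key=score.get))
    return combined


def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def ga_weights(group_preds, truth, pop_size=20, generations=30, seed=0):
    """Toy GA: search weights in [0, 1] that maximise weighted-vote accuracy.

    Assumed design (not from the paper): elitist selection, one-point
    crossover, Gaussian mutation of a single gene.
    """
    rng = random.Random(seed)
    n = len(group_preds)

    def fitness(w):
        return accuracy(weighted_vote(group_preds, w), truth)

    # Seed the population with equal weights (plain majority vote) so the
    # result can never be worse than the unweighted baseline.
    pop = [[1.0] * n] + [[rng.random() for _ in range(n)]
                         for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]          # one-point crossover
            i = rng.randrange(n)               # mutate one weight
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)


# Usage with made-up predictions: three members form one decision group.
knn, nb, tree = [0, 1, 1], [0, 1, 0], [1, 1, 0]
print(group_vote([knn, nb, tree]))
```

In a fuller pipeline, each list passed to `ga_weights` would be the output of `group_vote` for one decision group on a validation set, and the returned weights would replace the equal weights of a plain majority vote.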
Pages: 15-19
Number of pages: 5