iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space

Citations: 143
Authors
Akbar, Shahid [1]
Hayat, Maqsood [1]
Iqbal, Muhammad [1]
Jan, Mian Ahmad [1]
Affiliations
[1] Abdul Wali Khan Univ, Dept Comp Sci, Mardan 23200, KP, Pakistan
Keywords
Am-PseAAC; Anticancer; SVM; Genetic algorithm; Majority voting; AMINO-ACID-COMPOSITION; SUBCELLULAR LOCATION PREDICTION; SHOCK-PROTEIN FAMILIES; 3 DIFFERENT MODES; WEB SERVER; RECOMBINATION SPOTS; CANCER BIOMARKERS; FEATURE-SELECTION; PSEAAC; SITES;
DOI
10.1016/j.artmed.2017.06.008
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies, such as chemotherapy and radiation, are highly expensive, susceptible to errors, and ineffective. These conventional techniques also induce severe side effects on human cells. Given the perilous impact of cancer, an accurate and highly efficient intelligent computational model for the identification of anticancer peptides is desirable. In this paper, an evolutionary intelligent genetic algorithm-based ensemble model, 'iACP-GAEnsC', is proposed for the identification of anticancer peptides. In this model, the protein sequences are formulated using three different discrete feature representation methods: amphiphilic pseudo amino acid composition, g-gap dipeptide composition, and reduced amino acid alphabet composition. The performance of each extracted feature space is investigated separately, and the feature spaces are then merged to show the significance of hybridization. In addition, the predictions of the individual classifiers are combined using an optimized genetic algorithm and simple majority voting in order to enhance the true classification rate. Genetic algorithm-based ensemble classification is observed to outperform the individual classifiers as well as the simple majority voting-based ensemble. The highest performance of the genetic algorithm-based ensemble is reported on the hybrid feature space, with an accuracy of 96.45%. In comparison to existing techniques, the 'iACP-GAEnsC' model achieves a remarkable improvement in terms of various performance metrics. Based on the simulation results, the 'iACP-GAEnsC' model might be a leading tool for researchers in the field of drug design and proteomics. (C) 2017 Elsevier B.V. All rights reserved.
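As a rough illustration of two ideas named in the abstract, the following Python sketch computes a g-gap dipeptide composition vector and uses a small genetic algorithm to weight the votes of several binary classifiers. It is not the authors' implementation: the function names, the 0/1 vote encoding, the truncation selection, the single-point crossover, the Gaussian mutation, and all parameter values are assumptions chosen for brevity, and the abstract does not specify any of them.

```python
import random
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs

def g_gap_dipeptide_composition(seq, g=0):
    """Frequency of each residue pair separated by g intervening residues."""
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]

def weighted_vote(votes, weights):
    """Combine 0/1 votes; equal weights reduce to simple majority voting."""
    score = sum(w * v for w, v in zip(weights, votes))
    return 1 if score > 0.5 * sum(weights) else 0

def accuracy(weights, all_votes, labels):
    hits = sum(weighted_vote(v, weights) == y for v, y in zip(all_votes, labels))
    return hits / len(labels)

def ga_optimize_weights(all_votes, labels, pop_size=20, generations=30, seed=0):
    """Hypothetical GA over classifier weights (needs at least 2 classifiers)."""
    rng = random.Random(seed)
    n = len(all_votes[0])
    # Start from equal weights (majority voting) plus random candidates.
    pop = [[1.0] * n] + [[rng.random() for _ in range(n)]
                         for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda w: accuracy(w, all_votes, labels), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)   # single-point crossover
            child = a[:cut] + b[cut:]
            j = rng.randrange(n)        # mutate one weight, clamp to [0, 1]
            child[j] = min(1.0, max(0.0, child[j] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: accuracy(w, all_votes, labels))

# Toy data (made up): each tuple holds the votes of three classifiers.
all_votes = [(1, 0, 0), (1, 1, 1), (0, 1, 1), (0, 0, 0), (1, 0, 0), (0, 1, 1)]
labels = [1, 1, 0, 0, 1, 0]
best = ga_optimize_weights(all_votes, labels)
print("majority accuracy:", accuracy([1.0, 1.0, 1.0], all_votes, labels))
print("GA-weighted accuracy:", accuracy(best, all_votes, labels))
```

Because the initial population contains the equal-weight vector and the best candidates are carried over each generation, the GA-weighted ensemble in this sketch never scores below simple majority voting on the data it is tuned on. A real evaluation would tune the weights on held-out data to avoid overfitting.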
Pages: 62-70
Page count: 9