pAtbP-EnC: Identifying Anti-Tubercular Peptides Using Multi-Feature Representation and Genetic Algorithm-Based Deep Ensemble Model

Cited by: 51
Authors
Akbar, Shahid [1]
Raza, Ali [2]
Al Shloul, Tamara [3]
Ahmad, Ashfaq [2]
Saeed, Aamir [4]
Ghadi, Yazeed Yasin [5]
Mamyrbayev, Orken [6]
Tag-Eldin, Elsayed [7]
Affiliations
[1] Abdul Wali Khan Univ Mardan, Dept Comp Sci, Mardan 23200, Khyber Pakhtunk, Pakistan
[2] MY Univ Islamabad, Dept Comp Sci, Islamabad 44000, Pakistan
[3] Liwa Coll Technol, Dept Gen Educ, Abu Dhabi, U Arab Emirates
[4] Univ Engn & Technol, Dept Comp Sci & IT, Peshawar 25000, Pakistan
[5] Al Ain Univ, Dept Comp Sci, Abu Dhabi, U Arab Emirates
[6] Inst Informat & Computat Technol, Alma Ata 050010, Kazakhstan
[7] Future Univ Egypt, Fac Engn & Technol, New Cairo 11835, Egypt
Keywords
Amino acids; Peptides; Training; Tuberculosis; Predictive models; Numerical models; Computational modeling; Anti-tubercular peptides; ensemble classification; genetic algorithm; hybrid representation; k-fold cross-validation test; ACCURATE PREDICTION; ANTICANCER PEPTIDES; LEARNING FRAMEWORK; CLASSIFICATION; IDENTIFICATION; PROTEINS;
DOI
10.1109/ACCESS.2023.3321100
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Mycobacterium tuberculosis, a highly dangerous human pathogen, is the causative agent of tuberculosis (TB), which affects nearly 33% of the global population. With the increasing prevalence of multidrug-resistant TB, there is a need for novel and efficacious alternative therapies. Peptide therapies have emerged as a favorable alternative because of their remarkable specificity in targeting cells without affecting healthy cells. However, experimental methods for identifying anti-tubercular peptides (AtbPs) are labor-intensive and costly, and accurate prediction of AtbPs has also become challenging owing to the large number of peptide samples. In this paper, we propose an ensemble learning model that improves prediction outcomes by addressing the limitations of individual learning models. We formulate the training samples using four distinct representation methods to numerically encode peptide samples: AAindex, Composition/Transition/Distribution, Dipeptide Deviation from Expected Mean, and Enhanced Grouped Amino Acid Composition. The feature vectors extracted by these methods are fused to form a compact vector. We evaluate the prediction rates of three different classification models on both the individual and the heterogeneous vectors. Furthermore, we enhance the prediction and training capabilities of the proposed model by using the predicted labels of the individual classifiers to implement an ensemble deep model via a genetic algorithm. Evaluated on both the training and the independent datasets, the proposed ensemble learner achieves accuracies of 97.80%, 95.13%, 93.91%, and 94.17% on the RD training, MD training, RD independent, and MD independent datasets, respectively. Our findings demonstrate that the proposed pAtbP-EnC model outperforms existing predictors, reporting approximately 11% higher training accuracy.
We conclude that the pAtbP-EnC predictor will be a valuable tool for pharmaceutical design and academic research. The datasets used and the source code are publicly available at https://github.com/Intelligent-models/pAtbP-EnC2023.
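The abstract does not say how the genetic algorithm combines the base classifiers, so the following is only a minimal sketch under stated assumptions: the GA searches for voting weights over the predicted labels of the individual classifiers, and fitness is plain accuracy. The function names (`weighted_vote`, `ga_ensemble_weights`) and all hyperparameters are hypothetical and are not taken from the pAtbP-EnC source code.

```python
import numpy as np

def weighted_vote(labels, weights):
    """Combine 0/1 predictions of shape (n_classifiers, n_samples) by weighted vote."""
    score = weights @ labels / weights.sum()
    return (score >= 0.5).astype(int)

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

def ga_ensemble_weights(labels, y_true, pop_size=20, generations=30,
                        mutation_scale=0.1, seed=0):
    """Search for voting weights with a simple elitist GA (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    n_clf = labels.shape[0]
    # Random initial population; the first n_clf individuals are one-hot so that
    # each base classifier on its own is a candidate solution.
    pop = rng.random((pop_size, n_clf)) + 1e-6
    pop[:n_clf] = np.eye(n_clf)

    def fitness_of(population):
        return np.array([accuracy(y_true, weighted_vote(labels, w))
                         for w in population])

    for _ in range(generations):
        fitness = fitness_of(pop)
        # Selection: keep the better half unchanged (elitism).
        parents = pop[np.argsort(fitness)[::-1][: pop_size // 2]]
        n_children = pop_size - len(parents)
        # Uniform crossover between two randomly chosen parents.
        a = parents[rng.integers(0, len(parents), size=n_children)]
        b = parents[rng.integers(0, len(parents), size=n_children)]
        mask = rng.random((n_children, n_clf)) < 0.5
        children = np.where(mask, a, b)
        # Gaussian mutation, clipped so the weights stay positive.
        children = np.clip(
            children + rng.normal(0.0, mutation_scale, children.shape), 1e-6, None
        )
        pop = np.vstack([parents, children])

    fitness = fitness_of(pop)
    best = pop[np.argmax(fitness)]
    return best / best.sum(), float(fitness.max())

if __name__ == "__main__":
    # Toy data: three simulated classifiers with different error rates.
    rng = np.random.default_rng(1)
    y = rng.integers(0, 2, size=200)
    labels = np.vstack([np.where(rng.random(y.size) < p, 1 - y, y)
                        for p in (0.1, 0.2, 0.3)])
    weights, acc = ga_ensemble_weights(labels, y)
    print("weights:", weights, "accuracy:", acc)
```

Because the one-hot individuals are in the initial population and the better half survives each generation, the returned weights never score below the best single classifier on the data used for fitness. Held-out performance would still need a separate check, for example with the k-fold cross-validation listed in the keywords.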
Pages: 137099-137114
Number of pages: 16