pAtbP-EnC: Identifying Anti-Tubercular Peptides Using Multi-Feature Representation and Genetic Algorithm-Based Deep Ensemble Model
Cited by: 51
Authors:
Akbar, Shahid [1]
Raza, Ali [2]
Al Shloul, Tamara [3]
Ahmad, Ashfaq [2]
Saeed, Aamir [4]
Ghadi, Yazeed Yasin [5]
Mamyrbayev, Orken [6]
Tag-Eldin, Elsayed [7]
Affiliations:
[1] Abdul Wali Khan Univ Mardan, Dept Comp Sci, Mardan 23200, Khyber Pakhtunk, Pakistan
[2] MY Univ Islamabad, Dept Comp Sci, Islamabad 44000, Pakistan
[3] Liwa Coll Technol, Dept Gen Educ, Abu Dhabi, U Arab Emirates
[4] Univ Engn & Technol, Dept Comp Sci & IT, Peshawar 25000, Pakistan
[5] Al Ain Univ, Dept Comp Sci, Abu Dhabi, U Arab Emirates
[6] Inst Informat & Computat Technol, Alma Ata 050010, Kazakhstan
[7] Future Univ Egypt, Fac Engn & Technol, New Cairo 11835, Egypt
Source:
Keywords:
Amino acids;
Peptides;
Training;
Tuberculosis;
Predictive models;
Numerical models;
Computational modeling;
Anti-tubercular peptides;
ensemble classification;
genetic algorithm;
hybrid representation;
k-fold cross-validation test;
ACCURATE PREDICTION;
ANTICANCER PEPTIDES;
LEARNING FRAMEWORK;
CLASSIFICATION;
IDENTIFICATION;
PROTEINS;
DOI:
10.1109/ACCESS.2023.3321100
Chinese Library Classification (CLC) number:
TP [Automation technology, computer technology];
Discipline classification code:
0812;
Abstract:
Mycobacterium tuberculosis, a dangerous human pathogen, is the causative agent of tuberculosis (TB) and affects nearly 33% of the global population. With the increasing prevalence of multidrug-resistant TB, there is a need for novel and efficacious alternative therapies. Peptide therapies have emerged as a favorable alternative because of their high specificity in targeting cells without affecting healthy cells. However, experimental methods for identifying anti-tubercular peptides (AtbPs) are labor-intensive and costly, and the large number of peptide samples makes accurate prediction of AtbPs challenging. In this paper, we propose an ensemble learning model that improves prediction outcomes by addressing the limitations of individual learning models. We numerically encode the peptide samples using four distinct representation methods: AAindex, Composition/Transition/Distribution, Dipeptide Deviation from Expected Mean, and Enhanced Grouped Amino Acid Composition. The feature vectors extracted by these methods are fused into a compact vector. We evaluate the prediction rates of three different classification models on both the individual and the heterogeneous (fused) vectors. Furthermore, we enhance the prediction and training capabilities of the proposed model by combining the predicted labels of the individual classifiers into a deep ensemble model via a genetic algorithm. The proposed ensemble learner achieves accuracies of 97.80%, 95.13%, 93.91%, and 94.17% on the RD training, MD training, RD independent, and MD independent datasets, respectively. The proposed pAtbP-EnC model outperforms existing predictors, reporting approximately 11% higher training accuracy. We conclude that the pAtbP-EnC predictor will be a valuable tool for pharmaceutical design and academic research. The datasets and the source code are publicly available at https://github.com/Intelligent-models/pAtbP-EnC2023.
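The abstract describes combining the predicted labels of individual classifiers into an ensemble by means of a genetic algorithm. The sketch below illustrates that general idea only and is not the authors' implementation: it assumes three hypothetical base classifiers that emit 0/1 labels, uses plain accuracy as the fitness function, and evolves one weight per classifier for a weighted majority vote. The toy data (`VOTES`, `LABELS`), the function names, and the operator choices (truncation selection, one-point crossover, Gaussian mutation) are all assumptions made for illustration.

```python
import random

# Hypothetical toy data: each row holds the 0/1 votes of three base classifiers
# for one peptide; LABELS holds the true class (1 = AtbP, 0 = non-AtbP).
VOTES = [[1, 0, 0], [0, 1, 1], [1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
LABELS = [1, 0, 1, 0, 1, 0]


def weighted_vote(weights, votes):
    """Return 1 if the weighted share of positive votes is at least one half."""
    score = sum(w * v for w, v in zip(weights, votes))
    return 1 if score >= 0.5 * sum(weights) else 0


def accuracy(weights, vote_matrix, labels):
    """Fitness: fraction of samples where the weighted vote matches the label."""
    hits = sum(weighted_vote(weights, votes) == y
               for votes, y in zip(vote_matrix, labels))
    return hits / len(labels)


def evolve_weights(vote_matrix, labels, pop_size=20, generations=30, seed=0):
    """Search for per-classifier weights with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    n = len(vote_matrix[0])  # number of base classifiers (must be >= 2)
    pop = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        pop.sort(key=lambda w: accuracy(w, vote_matrix, labels), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)                # Gaussian mutation of one gene
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: accuracy(w, vote_matrix, labels))


if __name__ == "__main__":
    best = evolve_weights(VOTES, LABELS)
    print("weights:", best, "accuracy:", accuracy(best, VOTES, LABELS))
```

In this toy data the first classifier is always correct while an unweighted majority vote gets only four of six samples right, so a weighting that favors the first classifier (for example `[3, 1, 1]`) classifies every sample correctly. That gap is what the genetic search is meant to exploit.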
Pages: 137099-137114
Page count: 16