AIPs-SnTCN: Predicting Anti-Inflammatory Peptides Using fastText and Transformer Encoder-Based Hybrid Word Embedding with Self-Normalized Temporal Convolutional Networks

Cited by: 86
Authors
Raza, Ali [1, 2]
Uddin, Jamal [1]
Almuhaimeed, Abdullah [3]
Akbar, Shahid [4, 5]
Zou, Quan [4, 6]
Ahmad, Ashfaq [2]
Affiliations
[1] Qurtuba Univ Sci & Informat Technol, Dept Phys & Numer Sci, Peshawar 25124, Khyber Pakhtunk, Pakistan
[2] MY Univ, Dept Comp Sci, Islamabad 45750, Pakistan
[3] King Abdulaziz City Sci & Technol, Digital Hlth Inst, Riyadh 11442, Saudi Arabia
[4] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu 610054, Peoples R China
[5] Abdul Wali Khan Univ Mardan, Dept Comp Sci, Mardan 23200, Khyber Pakhtunk, Pakistan
[6] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Quzhou, Quzhou 324000, Peoples R China
Keywords
DEEP NEURAL-NETWORK; OVERSAMPLING TECHNIQUE; ANTICANCER PEPTIDES; FEATURE-SELECTION; IDENTIFICATION; AUTOIMMUNE; MODEL; CLASSIFICATION; SMOTE; INFLAMMATION;
DOI
10.1021/acs.jcim.3c01563
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
Inflammation is a biologically resistant response to harmful stimuli, such as infection, damaged cells, toxic chemicals, or tissue injuries. Its purpose is to eradicate pathogenic micro-organisms or irritants and facilitate tissue repair. Prolonged inflammation can result in chronic inflammatory diseases. However, wet-laboratory-based treatments are costly and time-consuming and may have adverse side effects on normal cells. In the past decade, peptide therapeutics have gained significant attention due to their high specificity in targeting affected cells without affecting healthy cells. Motivated by the significance of peptide-based therapies, we developed a highly discriminative prediction model called AIPs-SnTCN to predict anti-inflammatory peptides accurately. The peptide samples are encoded using word embedding techniques such as skip-gram and attention-based bidirectional encoder representations from transformers (BERT). In addition, the conjoint triad feature (CTF) is used to collect structure-based cluster profile features. The word embedding and sequential features are fused into a single vector to compensate for the limitations of single encoding methods. Support vector machine-based recursive feature elimination (SVM-RFE) is applied to select a ranking-based optimal feature space. The optimized feature space is used to train an improved self-normalized temporal convolutional network (SnTCN). The AIPs-SnTCN model achieved a predictive accuracy of 95.86% and an AUC of 0.97 using training samples. On the alternate training data set, our model obtained an accuracy of 92.04% and an AUC of 0.96. The proposed AIPs-SnTCN model outperformed existing models with approximately 19% higher accuracy and approximately 14% higher AUC. The reliability and efficacy of our AIPs-SnTCN model make it a valuable tool for scientists, and it may play a beneficial role in pharmaceutical design and academic research.
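As an illustration of the conjoint triad feature (CTF) mentioned in the abstract, the following is a minimal sketch in Python. It assumes the commonly used seven-class residue grouping and a simple frequency normalization; the abstract does not state the grouping or normalization used by AIPs-SnTCN, and the names `CLASSES` and `conjoint_triad` are hypothetical.

```python
from itertools import product

# Assumed seven-class residue grouping often used for conjoint triads;
# the abstract does not specify the grouping used by the authors.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}

# Map each ordered triple of class labels to a position in the 343-dim vector.
TRIAD_INDEX = {t: k for k, t in enumerate(product(range(7), repeat=3))}


def conjoint_triad(sequence):
    """Return a 343-dimensional vector of triad frequencies for a peptide.

    Residues outside the 20 standard amino acids are skipped before
    triads are formed. Sequences with fewer than three usable residues
    yield an all-zero vector.
    """
    labels = [RESIDUE_TO_CLASS[aa] for aa in sequence.upper()
              if aa in RESIDUE_TO_CLASS]
    counts = [0] * len(TRIAD_INDEX)
    for i in range(len(labels) - 2):
        counts[TRIAD_INDEX[tuple(labels[i:i + 3])]] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    vec = conjoint_triad("AGVC")
    print(len(vec), sum(vec))
```

In a fusion scheme like the one the abstract describes, a vector of this kind would be concatenated with the word-embedding features before feature selection; that step is not shown here.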
Pages: 6537-6554
Number of pages: 18