IF-AIP: A machine learning method for the identification of anti-inflammatory peptides using multi-feature fusion strategy

被引:24
作者
Gaffar, Saima [1 ]
Hassan, Mir Tanveerul [1 ]
Tayara, Hilal [2 ]
Chong, Kil To [1 ,3 ]
机构
[1] Jeonbuk Natl Univ, Dept Elect & Informat Engn, Jeonju 54896, South Korea
[2] Jeonbuk Natl Univ, Sch Int Engn & Sci, Jeonju 54896, South Korea
[3] Jeonbuk Natl Univ, Adv Elect & Informat Res Ctr, Jeonju 54896, South Korea
基金
新加坡国家研究基金会;
关键词
Anti-inflammatory peptides; Machine learning; Voting classifier; Bioinformatics; AMINO-ACID-COMPOSITION; INFLAMMATION; MECHANISMS;
D O I
10.1016/j.compbiomed.2023.107724
中图分类号
Q [生物科学];
学科分类号
07 ; 0710 ; 09 ;
摘要
Background: The most commonly used therapy currently for inflammatory and autoimmune diseases is non-specific anti-inflammatory drugs, which have various hazardous side effects. Recently, some anti-inflammatory peptides (AIPs) have been found to be a substitute therapy for inflammatory diseases like rheumatoid arthritis and Alzheimer's. Therefore, the identification of these AIPs is an emerging topic that is equally important.Methods: In this work, we have proposed an identification model for AIPs using a voting classifier. We used eight different feature descriptors and five conventional machine-learning classifiers. The eight feature encodings were concatenated to get a hybrid feature set. The five baseline models trained on the hybrid feature set were integrated via a voting classifier. Finally, a feature selection algorithm was used to select the optimal feature set for the construction of our final model, named IF-AIP.Results: We tested the proposed model on two independent datasets. On independent data 1, the IF-AIP model shows an improvement of 3%-5.6% in terms of accuracies and 6.7%-10.8% in terms of MCC compared to the existing methods. On the independent dataset 2, our model IF-AIP shows an overall improvement of 2.9%-5.7% in terms of accuracy and 8.3%-8.6% in terms of MCC score compared to the existing methods. A comparative performance analysis was conducted between the proposed model and existing methods using a set of 24 novel peptide sequences. Notably, the IF-AIP method exhibited exceptional accuracy, correctly identifying all 24 peptides as AIPs. The source code, pre-trained models, and all datasets are made available at https://github.com/Mir-Saima/IF-AIP.
引用
收藏
页数:8
相关论文
共 39 条
[1]  
Abbas Z., 2023, Molecular Therapy
[2]   Optuna: A Next-generation Hyperparameter Optimization Framework [J].
Akiba, Takuya ;
Sano, Shotaro ;
Yanase, Toshihiko ;
Ohta, Takeru ;
Koyama, Masanori .
KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, :2623-2631
[3]   A CNN-Based RNA N6-Methyladenosine Site Predictor for Multiple Species Using Heterogeneous Features Representation [J].
Alam, Waleed ;
Ali, Syed Danish ;
Tayara, Hilal ;
Chong, Kil To .
IEEE ACCESS, 2020, 8 :138203-138209
[4]   Identification of Functional piRNAs Using a Convolutional Neural Network [J].
Ali, Syed Danish ;
Alam, Waleed ;
Tayara, Hilal ;
Chong, Kil To .
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (03) :1661-1669
[5]   UniProt: a worldwide hub of protein knowledge [J].
Bateman, Alex ;
Martin, Maria-Jesus ;
Orchard, Sandra ;
Magrane, Michele ;
Alpi, Emanuele ;
Bely, Benoit ;
Bingley, Mark ;
Britto, Ramona ;
Bursteinas, Borisas ;
Busiello, Gianluca ;
Bye-A-Jee, Hema ;
Da Silva, Alan ;
De Giorgi, Maurizio ;
Dogan, Tunca ;
Castro, Leyla Garcia ;
Garmiri, Penelope ;
Georghiou, George ;
Gonzales, Daniel ;
Gonzales, Leonardo ;
Hatton-Ellis, Emma ;
Ignatchenko, Alexandr ;
Ishtiaq, Rizwan ;
Jokinen, Petteri ;
Joshi, Vishal ;
Jyothi, Dushyanth ;
Lopez, Rodrigo ;
Luo, Jie ;
Lussi, Yvonne ;
MacDougall, Alistair ;
Madeira, Fabio ;
Mahmoudy, Mahdi ;
Menchi, Manuela ;
Nightingale, Andrew ;
Onwubiko, Joseph ;
Palka, Barbara ;
Pichler, Klemens ;
Pundir, Sangya ;
Qi, Guoying ;
Raj, Shriya ;
Renaux, Alexandre ;
Lopez, Milagros Rodriguez ;
Saidi, Rabie ;
Sawford, Tony ;
Shypitsyna, Aleksandra ;
Speretta, Elena ;
Turner, Edward ;
Tyagi, Nidhi ;
Vasudev, Preethi ;
Volynkin, Vladimir ;
Wardell, Tony .
NUCLEIC ACIDS RESEARCH, 2019, 47 (D1) :D506-D515
[6]   Classification of nuclear receptors based on amino acid composition and dipeptide composition [J].
Bhasin, M ;
Raghava, GPS .
JOURNAL OF BIOLOGICAL CHEMISTRY, 2004, 279 (22) :23262-23266
[7]   SAPPHIRE: A stacking-based ensemble learning framework for accurate prediction of thermophilic proteins [J].
Charoenkwan, Phasit ;
Schaduangrat, Nalini ;
Moni, Mohammad Ali ;
Lio, Pietro ;
Manavalan, Balachandran ;
Shoombuatong, Watshara .
COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 146
[8]   StackIL6: a stacking ensemble model for improving the prediction of IL-6 inducing peptides [J].
Charoenkwan, Phasit ;
Chiangjong, Wararat ;
Nantasenamat, Chanin ;
Hasan, Md Mehedi ;
Manavalan, Balachandran ;
Shoombuatong, Watshara .
BRIEFINGS IN BIOINFORMATICS, 2021, 22 (06)
[9]   SMOTE: Synthetic minority over-sampling technique [J].
Chawla, Nitesh V. ;
Bowyer, Kevin W. ;
Hall, Lawrence O. ;
Kegelmeyer, W. Philip .
2002, American Association for Artificial Intelligence (16)
[10]   iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data [J].
Chen, Zhen ;
Zhao, Pei ;
Li, Fuyi ;
Marquez-Lago, Tatiana T. ;
Leier, Andre ;
Revote, Jerico ;
Zhu, Yan ;
Powell, David R. ;
Akutsu, Tatsuya ;
Webb, Geoffrey, I ;
Chou, Kuo-Chen ;
Smith, A. Ian ;
Daly, Roger J. ;
Li, Jian ;
Song, Jiangning .
BRIEFINGS IN BIOINFORMATICS, 2020, 21 (03) :1047-1057