Prediction of Therapeutic Peptides Using Machine Learning: Computational Models, Datasets, and Feature Encodings

Cited by: 17
Authors
Attique, Muhammad [1, 2]
Farooq, Muhammad Shoaib [1]
Khelifi, Adel [3]
Abid, Adnan [1]
Affiliations
[1] Univ Management & Technol, Dept Comp Sci, Lahore 54000, Pakistan
[2] Univ Gujrat, Dept Informat Technol, Gujrat City 50700, Pakistan
[3] Abu Dhabi Univ, Dept Comp Sci & Informat Technol, Abu Dhabi, U Arab Emirates
Keywords
Peptides; Machine learning; Encoding; Predictive models; Computational modeling; Drugs; Spectroscopy; Anti-angiogenic; anti-cancer; anti-inflammatory; anti-microbial; feature extraction; encodings; machine learning; peptide therapeutics; AMINO-ACID-COMPOSITION; CHAOS GAME REPRESENTATION; IMMUNE EPITOPE DATABASE; WEB SERVER; ANTIMICROBIAL PEPTIDES; ANTICANCER PEPTIDES; NEURAL-NETWORKS; CD-HIT; PROTEIN; CLASSIFICATION;
DOI
10.1109/ACCESS.2020.3015792
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Peptides, short chains of amino acids, have shown great potential for the investigation and development of novel medications for treatment or therapy. Wet-lab discovery of potential therapeutic peptides, and the drug development that follows, is a hard and time-consuming process. Computational prediction using machine learning (ML) methods can expedite and facilitate the discovery of candidates with therapeutic effects. ML approaches have been applied extensively and successfully to proteins, DNA, and RNA to discover hidden features and functional activities, and have recently been used for the functional discovery of peptides for various therapeutics. This paper presents a systematic literature review (SLR) that identifies the data sources, ML classifiers, and encoding schemes used in state-of-the-art computational models for predicting therapeutic peptides. For the SLR, forty-one research articles were selected based on well-defined selection criteria. To the best of our knowledge, no other SLR provides a comprehensive review of this domain. We propose a taxonomy based on the identified feature encodings, which may help researchers understand how the encodings relate to one another. We also introduce a framework model for the computational prediction of therapeutic peptides that characterizes the best practices and levels involved in developing peptide prediction models. Lastly, we discuss common issues and challenges to point researchers toward promising future directions in the computational prediction of therapeutic peptides.
Pages: 148570-148594
Page count: 25