Prediction of Therapeutic Peptides Using Machine Learning: Computational Models, Datasets, and Feature Encodings

Cited by: 17
Authors
Attique, Muhammad [1, 2]
Farooq, Muhammad Shoaib [1 ]
Khelifi, Adel [3 ]
Abid, Adnan [1 ]
Affiliations
[1] Univ Management & Technol, Dept Comp Sci, Lahore 54000, Pakistan
[2] Univ Gujrat, Dept Informat Technol, Gujrat City 50700, Pakistan
[3] Abu Dhabi Univ, Dept Comp Sci & Informat Technol, Abu Dhabi, U Arab Emirates
Source
IEEE ACCESS | 2020 / Volume 8 / Issue 08
Keywords
Peptides; Machine learning; Encoding; Predictive models; Computational modeling; Drugs; Spectroscopy; Anti-angiogenic; anti-cancer; anti-inflammatory; anti-microbial; feature extraction; encodings; machine learning; peptide therapeutics; AMINO-ACID-COMPOSITION; CHAOS GAME REPRESENTATION; IMMUNE EPITOPE DATABASE; WEB SERVER; ANTIMICROBIAL PEPTIDES; ANTICANCER PEPTIDES; NEURAL-NETWORKS; CD-HIT; PROTEIN; CLASSIFICATION;
DOI
10.1109/ACCESS.2020.3015792
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Peptides, short chains of amino acids, have shown great potential for the investigation and development of novel medications for treatment or therapy. Wet-lab discovery of potential therapeutic peptides, and eventual drug development, is a hard and time-consuming process. Computational prediction using machine learning (ML) methods can expedite and facilitate the discovery of candidates with therapeutic effects. ML approaches have been applied extensively and successfully to proteins, DNA, and RNA to discover hidden features and functional activities, and have recently been used for the functional discovery of peptides for various therapeutics. This paper presents a systematic literature review (SLR) to identify the data sources, ML classifiers, and encoding schemes used in state-of-the-art computational models that predict therapeutic peptides. To conduct the SLR, forty-one research articles were carefully selected based on well-defined selection criteria. To the best of our knowledge, no other SLR provides a comprehensive review of this domain. In this article, we propose a taxonomy based on the identified feature encodings, which may offer researchers a relational understanding. Similarly, a framework model for the computational prediction of therapeutic peptides is introduced to characterize the best practices and levels involved in developing peptide prediction models. Lastly, common issues and challenges are discussed to point researchers toward promising future directions in the computational prediction of therapeutic peptides.
Pages: 148570 - 148594
Page count: 25