Predicting protein-peptide binding sites with a deep convolutional neural network

Cited by: 28
Authors
Wardah, Wafaa [1]
Dehzangi, Abdollah [2]
Taherzadeh, Ghazaleh [3]
Rashid, Mahmood A. [4, 5]
Khan, M. G. M. [1]
Tsunoda, Tatsuhiko [6, 7, 8, 9]
Sharma, Alok [5, 7, 8, 10]
Affiliations
[1] Univ South Pacific, Fac Sci, Sch Comp Informat & Math Sci, Suva, Fiji
[2] Morgan State Univ, Dept Comp Sci, Baltimore, MD 21239 USA
[3] Univ Maryland, Inst Biosci & Biotechnol Res, College Pk, MD USA
[4] Victoria Univ, Inst Sustainable Ind & Liveable Cities, Melbourne, Vic, Australia
[5] Griffith Univ, Inst Integrated & Intelligent Syst, Nathan, Qld, Australia
[6] Tokyo Med & Dent Univ, Med Res Inst, Dept Med Sci Math, Tokyo, Japan
[7] RIKEN, Lab Med Sci Math, Ctr Integrat Med Sci, Yokohama, Kanagawa, Japan
[8] JST, CREST, Tokyo 1138510, Japan
[9] Univ Tokyo, Grad Sch Sci, Dept Biol Sci, Lab Med Sci Math, Tokyo, Japan
[10] Univ South Pacific, Sch Engn & Phys, Suva, Fiji
Keywords
Protein-peptide binding; Artificial intelligence; Deep learning; Convolutional neural network; Protein sequence; AMINO-ACID; SOLVENT; DATABASE; MODELS; DNA
DOI
10.1016/j.jtbi.2020.110278
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Motivation: Interactions between proteins and peptides influence biological functions. Predicting such biomolecular interactions can lead to faster disease prevention and help in drug discovery. Experimental methods for determining protein-peptide binding sites are costly and time-consuming, so computational methods have become prevalent. However, existing models show extremely low detection rates of actual peptide binding sites in proteins. To address this problem, we employed a two-stage technique: first, we extracted the relevant features from protein sequences and transformed them into images using a novel method; then, we applied a convolutional neural network to identify the peptide binding sites in proteins. Results: We found that our approach achieves 67% sensitivity, or recall (true positive rate), surpassing existing methods by over 35%. © 2020 Elsevier Ltd. All rights reserved.
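The sensitivity (recall, true positive rate) quoted in the abstract is the fraction of actual binding residues that a predictor recovers: TP / (TP + FN). The following is a minimal sketch of that definition only, not the authors' evaluation code; the function name and the per-residue labels are hypothetical and chosen purely for illustration.

```python
def sensitivity(y_true, y_pred):
    """Return recall = TP / (TP + FN) for binary labels (1 = binding residue)."""
    # True positives: binding residues predicted as binding.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # False negatives: binding residues predicted as non-binding.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no binding residues")
    return tp / (tp + fn)


# Hypothetical per-residue labels, not data from the paper.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 0, 0, 0]
print(sensitivity(y_true, y_pred))  # 0.5 (2 of 4 binding residues found)
```

Note that false positives do not enter this metric, so sensitivity alone does not describe how many non-binding residues are flagged incorrectly.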
Pages: 8