Analysis of several key factors influencing deep learning-based inter-residue contact prediction

Cited by: 22
Authors
Wu, Tianqi [1]
Hou, Jie [1]
Adhikari, Badri [2]
Cheng, Jianlin [1]
Affiliations
[1] Univ Missouri, Dept Elect Engn & Comp Sci, Columbia, MO 65211 USA
[2] Univ Missouri, Dept Math & Comp Sci, St Louis, MO 63121 USA
Keywords
PROTEIN; SEQUENCE;
DOI
10.1093/bioinformatics/btz679
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Deep learning has become the dominant technology for protein contact prediction. However, the factors that affect the performance of deep learning in contact prediction have not been systematically investigated.
Results: We analyzed the results of our three deep learning-based contact prediction methods (MULTICOM-CLUSTER, MULTICOM-CONSTRUCT and MULTICOM-NOVEL) in the CASP13 experiment and identified several key factors [i.e. deep learning technique, multiple sequence alignment (MSA), distance distribution prediction and domain-based contact integration] that influenced contact prediction accuracy. We compared our convolutional neural network (CNN)-based contact prediction methods with three coevolution-based methods on 75 CASP13 targets consisting of 108 domains. We demonstrated that the CNN-based multi-distance approach was able to leverage global coevolutionary coupling patterns composed of multiple correlated contacts to predict contacts more accurately than the local coevolution-based methods, increasing precision by 19.2 percentage points. We also tested different alignment methods and domain-based contact prediction with the deep learning contact predictors. The comparison of the three methods showed that deeper sequence alignments, and the integration of domain-based contact prediction with full-length contact prediction, improved the performance of contact prediction. Moreover, we demonstrated that domain-based contact prediction built on a novel ab initio approach, which parses domains from MSAs alone without using known protein structures, was a simple and fast way to improve contact prediction. Finally, we showed that predicting the distribution of inter-residue distances over multiple distance intervals could capture more structural information and improve binary contact prediction.
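Illustrative note: the abstract's last point, that a predicted distribution over distance intervals can feed a binary contact call, can be sketched roughly as below. This is a minimal sketch, not the authors' implementation. The bin edges, the 8 Å cutoff (a commonly used contact definition, assumed here), and the helper names `contact_probability` and `top_k_precision` are all hypothetical choices made for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple

Pair = Tuple[int, int]

# Assumed contact cutoff in angstroms; not taken from the abstract.
CONTACT_CUTOFF = 8.0


def contact_probability(bin_upper_edges: Sequence[float],
                        bin_probs: Sequence[float],
                        cutoff: float = CONTACT_CUTOFF) -> float:
    """Collapse a distance-bin distribution into one contact probability.

    Sums the probability mass of every bin whose upper edge is at or
    below the cutoff, i.e. bins lying entirely inside the contact range.
    """
    if len(bin_upper_edges) != len(bin_probs):
        raise ValueError("edges and probabilities must have equal length")
    return sum(p for edge, p in zip(bin_upper_edges, bin_probs)
               if edge <= cutoff)


def top_k_precision(scored_pairs: Iterable[Tuple[Pair, float]],
                    true_contacts: Set[Pair],
                    k: int) -> float:
    """Fraction of the k highest-scoring residue pairs that are true contacts."""
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _ in ranked if pair in true_contacts)
    return hits / len(ranked)


# Hypothetical distribution for one residue pair over five distance bins.
edges = [4.0, 6.0, 8.0, 10.0, float("inf")]
probs = [0.1, 0.2, 0.3, 0.25, 0.15]
p_contact = contact_probability(edges, probs)

# Hypothetical scored pairs and native contacts for a toy precision check.
scored = [((1, 10), 0.9), ((2, 20), 0.8), ((3, 30), 0.4)]
native = {(1, 10), (3, 30)}
precision_top2 = top_k_precision(scored, native, k=2)
```

In this toy setup the first three bins fall inside the cutoff, so their mass is summed into the contact probability, and precision is measured on the top-ranked pairs only, which is how a gain such as "19.2 percentage points" would typically be expressed.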
Pages: 1091-1098
Page count: 8