Detecting distant-homology protein structures by aligning deep neural-network based contact maps

Cited by: 34
Authors
Zheng, Wei [1 ,2 ,3 ]
Wuyun, Qiqige [2 ,3 ,4 ]
Li, Yang [1 ]
Mortuza, S. M. [1 ]
Zhang, Chengxin [1 ]
Pearce, Robin [1 ]
Ruan, Jishou [2 ,3 ,5 ]
Zhang, Yang [1 ,6 ]
Affiliations
[1] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[2] Nankai Univ, Coll Math Sci, Tianjin, Peoples R China
[3] Nankai Univ, LPMC, Tianjin, Peoples R China
[4] Michigan State Univ, Comp Sci & Engn Dept, E Lansing, MI 48824 USA
[5] Nankai Univ, State Key Lab Med Chem Biol, Tianjin, Peoples R China
[6] Univ Michigan, Dept Biol Chem, Ann Arbor, MI 48109 USA
Funding
U.S. National Science Foundation;
Keywords
STRUCTURE PREDICTION; SEQUENCE; ALIGNMENT; FRAGMENTS; SEARCH;
DOI
10.1371/journal.pcbi.1007411
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Accurate prediction of atomic-level protein structure is important for annotating the biological functions of protein molecules and for designing new compounds to regulate those functions. Template-based modeling (TBM), which aims to construct structural models by copying and refining the structural frameworks of other known proteins, remains the most accurate method for protein structure prediction. Due to the difficulty in recognizing distant-homology templates, however, the accuracy of TBM decreases rapidly when the evolutionary relationship between the query and template vanishes. In this study, we propose a new method, CEthreader, which first predicts residue-residue contacts by coupling evolutionary precision matrices with deep residual convolutional neural networks. The predicted contact maps are then integrated with sequence profile alignments to recognize structural templates from the PDB. The method was tested on two independent benchmark sets consisting collectively of 1,153 non-homologous protein targets, where CEthreader detected 176% or 36% more correct templates with a TM-score >0.5 than the best state-of-the-art profile- or contact-based threading methods, respectively, for the Hard targets that lacked homologous templates. Moreover, CEthreader was able to identify 114% or 20% more correct templates with the same Fold as the query, after excluding structures from the same SCOPe Superfamily, than the best profile- or contact-based threading methods, respectively. Detailed analyses show that the major advantage of CEthreader lies in the efficient coupling of contact maps with profile alignments, which helps recognize the global fold of protein structures when the homologous relationship between the query and template is weak. These results demonstrate an efficient new strategy to combine ab initio contact map prediction with profile alignments to significantly improve the accuracy of template-based structure prediction, especially for distant-homology proteins.
Author summary
Despite decades of effort in computational method development, template-based modeling (TBM) remains the most reliable approach to high-resolution protein structure prediction. Previous studies have shown that the PDB library is complete for single-domain proteins and that TBM is in principle sufficient to solve the structure prediction problem if the most similar structure in the PDB could be reliably identified and used as a template for model reconstruction. In reality, however, the success of TBM depends on the availability of closely homologous templates, and its accuracy and reliability decrease sharply as the evolutionary relationship between query and template becomes more distant. We developed a new threading approach, CEthreader, which allows for dynamic programming alignments of predicted contact maps through eigen-decomposition. The large-scale benchmark tests show that coupling contact maps with profile and secondary structure alignments through the proposed protocol can significantly improve the accuracy of template recognition for distantly homologous protein targets.
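The idea of aligning contact maps by dynamic programming after eigen-decomposition can be illustrated with a minimal sketch. It is not the CEthreader implementation: the function names `eigen_features` and `align`, the choice of `k`, the similarity measure (1 minus Euclidean distance), the gap penalty, and the use of absolute values to sidestep the arbitrary sign of each eigenvector are all assumptions made for illustration. The point it shows is that eigen-decomposition turns a two-dimensional contact map into per-residue feature vectors, which a standard Needleman-Wunsch recursion can then align.

```python
import numpy as np

def eigen_features(contact_map, k=3):
    """Per-residue features from the k eigenvectors with largest |eigenvalue|.

    Assumes a symmetric contact map. Taking absolute values removes the
    arbitrary sign of each eigenvector (a simplification for this sketch).
    """
    cmap = np.asarray(contact_map, dtype=float)
    vals, vecs = np.linalg.eigh(cmap)        # symmetric eigen-decomposition
    top = np.argsort(-np.abs(vals))[:k]      # indices of the k largest |eigenvalues|
    return np.abs(vecs[:, top]) * np.sqrt(np.abs(vals[top]))

def align(feat_a, feat_b, gap=-0.5):
    """Global (Needleman-Wunsch) alignment of two per-residue feature arrays.

    Returns the alignment score and the list of aligned residue index pairs.
    The similarity and gap penalty are hypothetical choices.
    """
    n, m = len(feat_a), len(feat_b)
    # Similarity of residue i in A and residue j in B: 1 minus feature distance.
    sim = 1.0 - np.linalg.norm(feat_a[:, None, :] - feat_b[None, :, :], axis=2)
    score = np.zeros((n + 1, m + 1))
    move = np.zeros((n + 1, m + 1), dtype=int)   # 0 = diagonal, 1 = up, 2 = left
    score[1:, 0] = gap * np.arange(1, n + 1)
    score[0, 1:] = gap * np.arange(1, m + 1)
    move[1:, 0] = 1
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = (
                score[i - 1, j - 1] + sim[i - 1, j - 1],  # align i with j
                score[i - 1, j] + gap,                    # gap in B
                score[i, j - 1] + gap,                    # gap in A
            )
            move[i, j] = int(np.argmax(options))
            score[i, j] = options[move[i, j]]
    # Trace back from the bottom-right corner to recover aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return float(score[n, m]), pairs[::-1]
```

In use, one would call `eigen_features` on the predicted query contact map and on a contact map derived from a template structure, then pass both feature arrays to `align`. A threading method of the kind described above would additionally combine such a contact term with profile and secondary structure scores, which this sketch omits.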
Pages: 27