Protein threading using residue co-variation and deep learning

Cited by: 53
Authors
Zhu, Jianwei [1, 2, 3]
Wang, Sheng [1]
Bu, Dongbo [2, 3]
Xu, Jinbo [1]
Affiliations
[1] Toyota Technol Inst, Chicago, IL 60637 USA
[2] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 626011, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China; US National Institutes of Health;
Keywords
STRUCTURE PREDICTION; FOLD RECOGNITION; STRUCTURE INFORMATION; ALIGNMENT; CONTACTS;
DOI
10.1093/bioinformatics/bty278
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: Template-based modeling, including homology modeling and protein threading, is a popular method for protein 3D structure prediction. However, alignment generation and template selection for protein sequences without close templates remain very challenging.
Results: We present a new method called DeepThreader to improve protein threading, including both alignment generation and template selection, by making use of deep learning (DL) and residue co-variation information. Our method first employs DL to predict the inter-residue distance distribution from residue co-variation and sequential information (e.g. sequence profile and predicted secondary structure), and then builds a sequence-template alignment by integrating predicted distance information and sequential features through an ADMM algorithm. Experimental results suggest that predicted inter-residue distance is helpful to both protein alignment and template selection, especially for protein sequences without very close templates. They also suggest that our method outperforms the currently popular homology modeling method HHpred and the threading method CNFpred by a large margin, and greatly outperforms the latest contact-assisted protein threading method, EigenTHREADER.
Pages: 263-273
Page count: 11