A study and benchmark of DNcon: a method for protein residue-residue contact prediction using deep networks

Cited by: 21
Authors
Eickholt, Jesse [1]
Cheng, Jianlin [2, 3, 4]
Affiliations
[1] Cent Michigan Univ, Dept Comp Sci, Mt Pleasant, MI 48859 USA
[2] Univ Missouri, Dept Comp Sci, Columbia, MO 65211 USA
[3] Univ Missouri, Inst Informat, Columbia, MO 65211 USA
[4] Univ Missouri, C Bond Life Sci Ctr, Columbia, MO 65211 USA
Keywords
CORRELATED MUTATIONS; NEURAL-NETWORKS; MAP PREDICTION; INFORMATION; MODELS
DOI
10.1186/1471-2105-14-S14-S12
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: In recent years, the use and importance of predicted protein residue-residue contacts have grown considerably, with demonstrated applications such as drug design, protein tertiary structure prediction and model quality assessment. Nevertheless, reported accuracies in the range of 25-35% stubbornly remain the norm for sequence-based, long-range contact predictions on hard targets, despite a prolonged effort by the community to improve the performance of residue-residue contact prediction. A thorough study of the quality of current residue-residue contact predictions and of the evaluation metrics used, together with an analysis of current methods, is needed to stimulate further advancement in contact prediction and its application. Such a study will better explain the quality and nature of the residue-residue contact predictions generated by current methods and, as a result, lead to better use of this contact information.
Results: We evaluated several sequence-based residue-residue contact predictors that participated in the tenth Critical Assessment of protein Structure Prediction (CASP) experiment. The evaluation used standard assessment techniques, such as those used by the official CASP assessors, as well as two novel evaluation metrics (i.e., cluster accuracy and cluster count). An in-depth analysis revealed that while most of the residue-residue contact predictions generated are not accurate at the residue level, a fairly strong contact signal is present when less than residue-level precision is allowed. Our residue-residue contact predictor, DNcon, performed particularly well, achieving an accuracy of 66% for the top L/10 long-range contacts when evaluated in a neighbourhood of size 2. The coverage of residue-residue contact areas was also greater with DNcon than with the other methods. We also provide an analysis of DNcon with respect to its underlying architecture and the features used for classification.
Conclusions: Our novel evaluation metrics demonstrate that current residue-residue contact predictions do contain a strong contact signal and are of better quality than standard evaluation metrics indicate. Our method, DNcon, is a robust, state-of-the-art, sequence-based residue-residue contact predictor that excelled under a number of evaluation schemes. It is available as a web service at http://iris.rnet.missouri.edu/dncon/.
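
The neighbourhood-based accuracy mentioned in the Results can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes that L is the sequence length, that "long range" means a sequence separation of at least 24 residues (a common CASP convention that the abstract does not state), and that a predicted contact (i, j) counts as correct when a true contact lies within delta residues of both i and j. The function name and all parameter names are hypothetical.

```python
from typing import Iterable, Set, Tuple


def top_fraction_accuracy(
    predictions: Iterable[Tuple[int, int, float]],
    true_contacts: Set[Tuple[int, int]],
    seq_len: int,
    delta: int = 0,
    min_separation: int = 24,  # assumed long-range cutoff, not from the abstract
    divisor: int = 10,         # top L/10 predictions
) -> float:
    """Accuracy of the top L/divisor long-range predictions.

    A prediction (i, j) is a hit if some true contact (i', j') satisfies
    |i - i'| <= delta and |j - j'| <= delta (assumed neighbourhood rule).
    """
    # Keep only long-range pairs, then rank by predicted probability.
    long_range = [p for p in predictions if abs(p[0] - p[1]) >= min_separation]
    long_range.sort(key=lambda p: p[2], reverse=True)

    top = long_range[: max(1, seq_len // divisor)]
    if not top:
        return 0.0

    def is_hit(i: int, j: int) -> bool:
        # Scan the (2*delta + 1)^2 neighbourhood around the predicted pair.
        for di in range(-delta, delta + 1):
            for dj in range(-delta, delta + 1):
                a, b = i + di, j + dj
                if (a, b) in true_contacts or (b, a) in true_contacts:
                    return True
        return False

    hits = sum(1 for i, j, _ in top if is_hit(i, j))
    return hits / len(top)


# Toy data for illustration only (not taken from the paper).
toy_predictions = [
    (1, 5, 0.99),    # short range, ignored
    (1, 30, 0.90),
    (2, 40, 0.80),
    (5, 45, 0.70),
    (10, 48, 0.60),
    (3, 35, 0.50),
]
toy_true = {(1, 30), (3, 41), (20, 45)}

exact = top_fraction_accuracy(toy_predictions, toy_true, seq_len=50, delta=0)
relaxed = top_fraction_accuracy(toy_predictions, toy_true, seq_len=50, delta=2)
```

On this toy input the relaxed score is higher than the exact one because the prediction (2, 40) lies one residue away from the true contact (3, 41). This is the kind of near-miss signal the abstract describes.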
Pages: 10