CNNcon: Improved Protein Contact Maps Prediction Using Cascaded Neural Networks

Cited by: 15
Authors
Ding, Wang [1]
Xie, Jiang [1,2,3]
Dai, Dongbo [1]
Zhang, Huiran [1]
Xie, Hao [4]
Zhang, Wu [1,2]
Affiliations
[1] Shanghai Univ, Sch Engn & Comp Sci, Shanghai, Peoples R China
[2] Shanghai Univ, Inst Syst Biol, Shanghai, Peoples R China
[3] Univ Calif Irvine, Dept Math, Irvine, CA 92717 USA
[4] Wuhan Univ, Coll Stomatol, Wuhan 430072, Peoples R China
Source
PLOS ONE | 2013 / Volume 8 / Issue 04
Keywords
RESIDUE CONTACTS; CORRELATED MUTATIONS
DOI
10.1371/journal.pone.0061533
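Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general]
Discipline classification codes
07; 0710; 09
Abstract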
Background: Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determining three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than the number of determined structures. Advances in computational science may allow protein modeling methods to bridge this large sequence-structure gap. Predicting the three-dimensional structure of a protein from its primary structure (the residue sequence) alone remains a grand challenge. Predicting residue contact maps is a crucial and promising intermediate step towards that goal. Better predictions of local and non-local contacts between residues can turn protein sequence alignment into structure alignment, which can in turn greatly improve template-based three-dimensional structure predictors.
Methods: This paper presents CNNcon, an improved contact map predictor based on multiple neural networks: six sub-networks and one final cascade network. The sub-networks and the cascade network were each trained and tested on their corresponding data sets. For testing, the target protein was first encoded and then passed to its corresponding sub-networks for prediction; the intermediate results were then passed to the cascade network to produce the final prediction.
Results: On average, CNNcon correctly predicts 58.86% of contacts at a distance cutoff of 8 Å for proteins with lengths ranging from 51 to 450 residues. In comparisons, the method outperforms the state-of-the-art predictors it was tested against. Notably, prediction accuracy remains steady as protein sequence length increases, which indicates that CNNcon overcomes the thin density problem that troubles other current predictors. This makes the method valuable for long proteins, whose effective prediction could become possible with CNNcon.
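The 8 Å contact definition and the accuracy figure quoted in the abstract can be illustrated with a small sketch. It is not the authors' code. The function names `contact_map` and `prediction_accuracy` are hypothetical, and the sketch assumes one representative atom coordinate per residue, treats a pair as in contact when its distance is at most the cutoff, and reads accuracy as the fraction of predicted contacts that are true contacts. The abstract does not specify any of these choices.

```python
import math

def contact_map(coords, cutoff=8.0):
    """Symmetric boolean matrix: True where two residues lie within `cutoff` angstroms.

    `coords` holds one (x, y, z) tuple per residue (assumed representative atom).
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = True
    return cmap

def prediction_accuracy(predicted, native, min_separation=1):
    """Fraction of predicted contacts (i < j, j - i >= min_separation) that are native contacts."""
    n = len(native)
    correct = total = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if predicted[i][j]:
                total += 1
                correct += native[i][j]  # True counts as 1
    return correct / total if total else 0.0

# Toy chain: four residues spaced 3.8 angstroms apart along the x axis.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
native = contact_map(coords)

# A made-up prediction: the native map plus one false contact between residues 0 and 3.
predicted = [row[:] for row in native]
predicted[0][3] = predicted[3][0] = True

# 5 of the 6 predicted pairs are native contacts.
accuracy = prediction_accuracy(predicted, native)
```

Raising `min_separation` restricts the evaluation to pairs further apart in sequence, which is how local and non-local contacts are usually scored separately.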
Pages: 7