MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins

Cited by: 253
Authors
Disfani, Fatemeh Miri [1]
Hsu, Wei-Lun [2,3]
Mizianty, Marcin J. [1]
Oldfield, Christopher J. [2,3]
Xue, Bin [4]
Dunker, A. Keith [2,3]
Uversky, Vladimir N. [4,5]
Kurgan, Lukasz [1]
Affiliations
[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2V4, Canada
[2] Indiana Univ, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA
[3] Indiana Univ, Dept Biochem & Mol Biol, Indianapolis, IN 46202 USA
[4] Univ S Florida, Dept Mol Med, Tampa, FL 33612 USA
[5] Russian Acad Sci, Inst Biol Instrumentat, Pushchino 142290, Russia
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
MOLECULAR RECOGNITION FEATURES; WEB-SERVER; INTRINSIC DISORDER; NEURAL-NETWORK; PSI-BLAST; DATABASE; DOMAINS; CONSERVATION; PRINCIPLES; RESIDUES;
DOI
10.1093/bioinformatics/bts209
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. However, only a limited number of experimentally validated MoRFs are known, which motivates the development of computational methods that predict MoRFs from protein chains.
Results: We introduce a new MoRF predictor, MoRFpred, which identifies all MoRF types (alpha, beta, coil and complex). We develop a comprehensive dataset of annotated MoRFs to build and empirically compare our method. MoRFpred utilizes a novel design in which annotations generated by sequence alignment are fused with predictions generated by a Support Vector Machine (SVM), which uses a custom-designed set of sequence-derived features. The features provide information about evolutionary profiles, selected physicochemical properties of amino acids, and predicted disorder, solvent accessibility and B-factors. Empirical evaluation on several datasets shows that MoRFpred outperforms related methods: alpha-MoRF-Pred, which predicts alpha-MoRFs, and ANCHOR, which finds disordered regions that become ordered when bound to a globular partner. We show that our predicted (new) MoRF regions have non-random sequence similarity with native MoRFs. We use this observation, along with the fact that predictions with higher probability are more accurate, to identify putative MoRF regions. We also identify a few sequence-derived hallmarks of MoRFs. They are characterized by dips in the disorder predictions and by higher hydrophobicity and stability when compared to adjacent (in the chain) residues.
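The "dip in the disorder predictions" hallmark mentioned in the abstract can be illustrated with a minimal sketch. This is not MoRFpred's feature computation; the function name `disorder_dip`, the `flank` window size, and the example scores are all hypothetical and chosen only to show the idea of comparing a candidate region with its adjacent residues.

```python
from statistics import mean


def disorder_dip(scores, start, end, flank=5):
    """Mean disorder score of the flanking residues minus that of the region.

    scores : per-residue disorder predictions (higher = more disordered)
    start, end : half-open residue range [start, end) of the candidate region
    flank : number of residues taken on each side (hypothetical parameter)

    A positive value means the region is predicted to be less disordered
    than its neighbours, i.e. it sits in a dip.
    """
    if not 0 <= start < end <= len(scores):
        raise ValueError("region must lie inside the chain")
    region = scores[start:end]
    # Flanks are clipped at the chain termini.
    flanks = scores[max(0, start - flank):start] + scores[end:end + flank]
    if not flanks:
        raise ValueError("no flanking residues available")
    return mean(flanks) - mean(region)


# Made-up scores for a 10-residue chain with a dip at positions 4-6.
scores = [0.9, 0.9, 0.8, 0.9, 0.4, 0.3, 0.4, 0.9, 0.8, 0.9]
dip = disorder_dip(scores, 4, 7, flank=3)
print(f"dip = {dip:.2f}")
```

The same comparison could be applied to hydrophobicity or stability profiles with the sign reversed, since the abstract reports those as higher inside MoRFs than in adjacent residues.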
Pages: I75 / I83
Page count: 9