MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins

Cited by: 253

Authors:

Disfani, Fatemeh Miri ^{[1]}

Hsu, Wei-Lun ^{[2,3]}

Mizianty, Marcin J. ^{[1]}

Oldfield, Christopher J. ^{[2,3]}

Xue, Bin ^{[4]}

Dunker, A. Keith ^{[2,3]}

Uversky, Vladimir N. ^{[4,5]}

Kurgan, Lukasz ^{[1]}

Affiliations:

[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2V4, Canada

[2] Indiana Univ, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA

[3] Indiana Univ, Dept Biochem & Mol Biol, Indianapolis, IN 46202 USA

[4] Univ S Florida, Dept Mol Med, Tampa, FL 33612 USA

[5] Russian Acad Sci, Inst Biol Instrumentat, Pushchino 142290, Russia

Source:

BIOINFORMATICS | 2012 / Vol. 28 / Issue 12

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

MOLECULAR RECOGNITION FEATURES; WEB-SERVER; INTRINSIC DISORDER; NEURAL-NETWORK; PSI-BLAST; DATABASE; DOMAINS; CONSERVATION; PRINCIPLES; RESIDUES;

DOI:

10.1093/bioinformatics/bts209

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Motivation: Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. However, only a limited number of experimentally validated MoRFs are known, which motivates development of computational methods that predict MoRFs from protein chains.

Results: We introduce a new MoRF predictor, MoRFpred, which identifies all MoRF types (alpha, beta, coil and complex). We develop a comprehensive dataset of annotated MoRFs to build and empirically compare our method. MoRFpred utilizes a novel design in which annotations generated by sequence alignment are fused with predictions generated by a Support Vector Machine (SVM), which uses a custom-designed set of sequence-derived features. The features provide information about evolutionary profiles, selected physicochemical properties of amino acids, and predicted disorder, solvent accessibility and B-factors. Empirical evaluation on several datasets shows that MoRFpred outperforms related methods: alpha-MoRF-Pred, which predicts alpha-MoRFs, and ANCHOR, which finds disordered regions that become ordered when bound to a globular partner. We show that our predicted (new) MoRF regions have non-random sequence similarity with native MoRFs. We use this observation, along with the fact that predictions with higher probability are more accurate, to identify putative MoRF regions. We also identify a few sequence-derived hallmarks of MoRFs: they are characterized by dips in the disorder predictions and by higher hydrophobicity and stability when compared to adjacent (in the chain) residues.

Pages: i75-i83

Number of pages: 9

References (46 in total):

[1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].

Altschul, SF; Madden, TL; Schaffer, AA; Zhang, JH; Zhang, Z; Miller, W; Lipman, DJ.

NUCLEIC ACIDS RESEARCH, 1997, 25 (17): 3389-3402

[2] Principal eigenvector of contact matrices and hydrophobicity profiles in proteins [J].

Bastolla, U; Porto, M; Roman, HE; Vendruscolo, M.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2005, 58 (01): 22-30

[3] The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data [J].

Berman, Helen; Henrick, Kim; Nakamura, Haruki; Markley, John L.

NUCLEIC ACIDS RESEARCH, 2007, 35: D301-D303

[4] Studies of the RNA degradosome-organizing domain of the Escherichia coli ribonuclease RNase E [J].

Callaghan, AJ; Aurikko, JP; Ilag, LL; Grossmann, JG; Chandran, V; Kühnel, K; Poljak, L; Carpousis, AJ; Robinson, CV; Symmons, MF; Luisi, BF.

JOURNAL OF MOLECULAR BIOLOGY, 2004, 340 (05): 965-979

[5] Conservation of intrinsic disorder in protein domains and families: I. A database of conserved predicted disordered regions [J].

Chen, JW; Romero, P; Uversky, VN; Dunker, AK.

JOURNAL OF PROTEOME RESEARCH, 2006, 5 (04): 879-887

[6] Conservation of intrinsic disorder in protein domains and families: II. Functions of conserved disorder [J].

Chen, JW; Romero, P; Uversky, VN; Dunker, AK.

JOURNAL OF PROTEOME RESEARCH, 2006, 5 (04): 888-898

[7] Prediction of protein B-factors using multi-class bounded SVM [J].

Chen, Peng; Wang, Bing; Wong, Hau-San; Huang, De-Shuang.

PROTEIN AND PEPTIDE LETTERS, 2007, 14 (02): 185-190

[8] Mining α-helix-forming molecular recognition features with cross species sequence alignments [J].

Cheng, Yugong; Oldfield, Christopher J.; Meng, Jingwei; Romero, Pedro; Uversky, Vladimir N.; Dunker, A. Keith.

BIOCHEMISTRY, 2007, 46 (47): 13468-13477

[9] SLiMDisc: short, linear motif discovery, correcting for common evolutionary descent [J].

Davey, Norman E.; Shields, Denis C.; Edwards, Richard J.

NUCLEIC ACIDS RESEARCH, 2006, 34 (12): 3546-3554

[10] IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content [J].

Dosztányi, Z; Csizmok, V; Tompa, P; Simon, I.

BIOINFORMATICS, 2005, 21 (16): 3433-3434
