PRINTR: Prediction of RNA binding sites in proteins using SVM and profiles

Cited by: 56
Authors
Wang, Y. [1]
Xue, Z. [2]
Shen, G. [2]
Xu, J. [3]
Affiliations
[1] Huazhong Univ Sci & Technol, Inst Biochem & Biophys, Sch Life Sci, Wuhan 430074, Peoples R China
[2] Huazhong Univ Sci & Technol, Software Coll, Wuhan 430074, Peoples R China
[3] Huazhong Univ Sci & Technol, Dept Control Sci & Engn, Wuhan 430074, Peoples R China
Keywords
protein-RNA interactions; RNA-binding sites; support vector machine; multiple sequence alignment
DOI
10.1007/s00726-007-0634-9
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Protein-RNA interactions play a key role in a number of biological processes, such as protein synthesis, mRNA processing, and the assembly and function of ribosomes and eukaryotic spliceosomes. Reliable identification of RNA-binding sites in RNA-binding proteins is important for functional annotation and site-directed mutagenesis. We developed a novel method for predicting protein residues that interact with RNA, using a support vector machine (SVM) and position-specific scoring matrices (PSSMs). Two cases were considered in predicting protein residues at RNA-binding surfaces: in one, only the sequence of a protein chain known to interact with RNA is given; in the other, structural information is also given. Across these two cases, five different inputs were tested. Coupled with PSI-BLAST profiles and predicted secondary structure, the present approach yields a Matthews correlation coefficient (MCC) of 0.432 in 7-fold cross-validation, which is the best among all previously reported RNA-binding site prediction methods. When structural information is given, we obtained an MCC of 0.457, with PSSMs, observed secondary structure, and solvent accessibility assigned by DSSP as input. A web server implementing the prediction method is available at http://210.42.106.80/printr/.
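The MCC values quoted in the abstract summarize per-residue binary predictions (binding vs. non-binding). As a rough illustration only, the standard MCC formula can be sketched as below. The function names and the 1 = binding / 0 = non-binding label encoding are assumptions made for this example; they are not taken from the paper or its web server.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (assumed: 1 = binding, 0 = non-binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion counts.

    Returns 0.0 when the denominator is zero (a common convention,
    adopted here as an assumption).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy labels for a 12-residue chain; purely illustrative data.
    observed = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
    predicted = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
    print(mcc(*confusion_counts(observed, predicted)))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect agreement), which makes it better suited than raw accuracy when binding residues are a minority class.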
Pages: 295-302
Page count: 8