Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors

Cited by: 25
Authors
Sun, Meijian [1]
Wang, Xia [1]
Zou, Chuanxin [1]
He, Zenghui [1]
Liu, Wei [1]
Li, Honglin [1]
Affiliations
[1] E China Univ Sci & Technol, Sch Pharm, Shanghai Key Lab New Drug Design, State Key Lab Bioreactor Engn, 130 Mei Long Rd, Shanghai 200237, Peoples R China
Source
BMC BIOINFORMATICS | 2016 / Volume 17
Funding
National Natural Science Foundation of China;
Keywords
Protein-RNA interactions; Residue triplet interface propensity; Residue electrostatic surface potential; Random forest classifier; Structural analysis; SECONDARY STRUCTURE; STRUCTURE ALIGNMENT; SITES; SEQUENCE; RECOGNITION; DNA; INFORMATION; INTERFACE; DATABASE; CLASSIFICATION;
DOI
10.1186/s12859-016-1110-x
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: RNA-binding proteins participate in many important biological processes concerning RNA-mediated gene regulation, and several computational methods have recently been developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help to improve the accuracy of these methods and provide further meaningful information for researchers. Results: In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity). Statistical and structural analysis of protein-RNA complexes showed that the two features were powerful for identifying RNA-binding protein residues. Using these two features and other excellent structure- and sequence-based features, a random forest classifier was constructed to predict RNA-binding residues. The area under the receiver operating characteristic curve (AUC) of five-fold cross-validation for our method on the training set RBP195 was 0.900, and when applied to the test set RBP68, the prediction accuracy (ACC) was 0.868 and the F-score was 0.631. Conclusions: The good prediction performance of our method revealed that the two newly designed descriptors could be discriminative for inferring protein residues interacting with RNAs. To facilitate the use of our method, a web server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind.
Pages: 14
Related papers
67 in total
[1]   Qgrid: clustering tool for detecting charged and hydrophobic regions in proteins [J].
Ahmad, S ;
Sarai, A .
NUCLEIC ACIDS RESEARCH, 2004, 32 :W104-W107
[2]   Structure-based analysis of Protein-RNA interactions using the program ENTANGLE [J].
Allers, J ;
Shamoo, Y .
JOURNAL OF MOLECULAR BIOLOGY, 2001, 311 (01) :75-86
[3]   Multi-disciplinary methods to define RNA-protein interactions and regulatory networks [J].
Ascano, Manuel ;
Gerstberger, Stefanie ;
Tuschl, Thomas .
CURRENT OPINION IN GENETICS & DEVELOPMENT, 2013, 23 (01) :20-28
[4]   Dissecting protein-RNA recognition sites [J].
Bahadur, Ranjit Prasad ;
Zacharias, Martin ;
Janin, Joel .
NUCLEIC ACIDS RESEARCH, 2008, 36 (08) :2705-2716
[5]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[6]   NAPS: a residue-level nucleic acid-binding prediction server [J].
Carson, Matthew B. ;
Langlois, Robert ;
Lu, Hui .
NUCLEIC ACIDS RESEARCH, 2010, 38 :W431-W435
[7]   An automated growth enclosure for metabolic labeling of Arabidopsis thaliana with 13C-carbon dioxide - an in vivo labeling system for proteomics and metabolomics research [J].
Chen, Wen-Ping ;
Yang, Xiao-Yuan ;
Harms, Geoffrey L. ;
Gray, William M. ;
Hegeman, Adrian D. ;
Cohen, Jerry D. .
PROTEOME SCIENCE, 2011, 9
[8]   Identifying RNA-binding residues based on evolutionary conserved structural and energetic features [J].
Chen, Yao Chi ;
Sargsyan, Karen ;
Wright, Jon D. ;
Huang, Yi-Shuian ;
Lim, Carmay .
NUCLEIC ACIDS RESEARCH, 2014, 42 (03) :e15
[9]   Chen YC, 2008, NUCLEIC ACIDS RES, V36, P5
[10]   Predicting RNA-binding sites of proteins using support vector machines and evolutionary information [J].
Cheng, Cheng-Wei ;
Su, Emily Chia-Yu ;
Hwang, Jenn-Kang ;
Sung, Ting-Yi ;
Hsu, Wen-Lian .
BMC BIOINFORMATICS, 2008, 9