beRBP: binding estimation for human RNA-binding proteins

Times cited: 30
Authors
Yu, Hui [1]
Wang, Jing [1,2]
Sheng, Quanhu [1,2]
Liu, Qi [1,2]
Shyr, Yu [1,2]
Affiliations
[1] Vanderbilt Univ, Ctr Quantitat Sci, Med Ctr, Nashville, TN 37232 USA
[2] Vanderbilt Univ, Dept Biostat, Med Ctr, Nashville, TN 37203 USA
Keywords
TRANSCRIPTOME-WIDE IDENTIFICATION; WEB SERVER; SITES; PREDICTION; DATABASE; SPECIFICITIES; ACCESSIBILITY; TARGETS
DOI
10.1093/nar/gky1294
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Identifying binding targets of RNA-binding proteins (RBPs) can greatly facilitate our understanding of their functional mechanisms. Most computational methods employ machine learning to train classifiers on either RBP-specific targets or pooled RBP-RNA interactions. The former strategy is more powerful, but it applies only to the few RBPs with a large number of known targets; conversely, the latter strategy sacrifices prediction accuracy for wider applicability, since specific interaction features are inevitably obscured by pooling heterogeneous datasets. Here, we present beRBP, a dual approach that predicts human RBP-RNA interaction given the position weight matrix (PWM) of an RBP and one RNA sequence. Based on Random Forests, beRBP not only builds a specific model for each RBP with a decent number of known targets, but also develops a general model for RBPs with few or no known targets. The specific and general models both compared well with existing methods on three benchmark datasets. Notably, the general model performed better than existing methods on most novel RBPs. Overall, as a composite solution spanning the RBP-specific and RBP-general strategies, beRBP is a promising tool for human RBP binding estimation with good prediction accuracy and a broad application scope.
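The abstract names two inputs, a PWM and an RNA sequence, but does not say how they are combined into features. As a minimal sketch of one common way to relate the two, the snippet below scans a sequence with a PWM and reports the best log-odds match. It is not the beRBP implementation: the matrix values, the uniform background, and the function name `best_pwm_match` are all assumptions made for illustration.

```python
import math

# Hypothetical 4-position PWM (per-base probabilities); values are
# illustrative and not taken from the paper.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "U": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "U": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "U": 0.7},
]


def best_pwm_match(seq, pwm, background=0.25):
    """Return (best log-odds score, 0-based start) over all windows of seq.

    Assumes a uniform background and a sequence containing only A, C, G,
    and U (T is converted to U); any other symbol raises KeyError.
    """
    seq = seq.upper().replace("T", "U")
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the PWM")
    best_score, best_pos = float("-inf"), -1
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Sum of per-position log2(probability / background).
        score = sum(
            math.log2(col[base] / background)
            for col, base in zip(pwm, window)
        )
        if score > best_score:
            best_score, best_pos = score, start
    return best_score, best_pos


score, pos = best_pwm_match("CCUAGCUGG", PWM)
print(pos, round(score, 3))  # best window starts at index 3 (AGCU)
```

A score of this kind could serve as one input feature to a Random Forest classifier alongside others; which features beRBP actually uses is not stated in the abstract.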
Pages: 10