RNA-binding proteins (RBPs) play important roles in the control of gene expression and the coordination of different layers of post-transcriptional regulation. Interactions between certain RBPs and mRNA transcripts are notoriously difficult to predict, as any given protein-RNA interaction may rely not only on RNA sequence, but also on three-dimensional RNA structures, competitive inhibition from other RBPs, and input from cellular signaling pathways. Advanced high-throughput technologies for identifying RNA-protein interactions have helped, but identifying binding sites and the downstream functional effects of RBPs from the resulting data can be challenging. In this review, we discuss statistical inference and machine-learning approaches and tools relevant for the study of RBPs and the analysis of large-scale RNA-protein interaction datasets. This primer is intended for life scientists who are interested in incorporating these tools into their own research. We begin by demystifying regression models, as used in the analysis of next-generation sequencing data, and progress to a discussion of Hidden Markov Models, which are of particular value in analyzing data from cross-linking followed by immunoprecipitation. We then continue with examples of machine-learning techniques, such as support vector machines and gradient tree boosting. We close with a brief discussion of current trends in the field, including deep learning architectures.