RNA-binding proteins (RBPs) play an important role in post-transcriptional regulation of RNA and recognize their target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account or employ models that are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling, which fully captures the relationship between RNA sequence and the secondary structure preference of a given RBP. Unlike previous methods, which output separate logos for sequence and structure, ssHMM directly produces a combined sequence-structure motif when trained on a large set of sequences. The model is visualized intuitively as a graph, which facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, recovers known RBP motifs from CLIP-Seq data, and scales linearly with input size. On large datasets it is considerably faster than MEMERIS and RNAcontext and on par with GraphProt. It is freely available on GitHub and as a Docker image.
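To make the idea of a combined sequence-structure motif concrete, the following is a minimal toy sketch, not ssHMM's actual implementation or API. It assumes that each motif position pairs a structural context with a nucleotide, that the structure labels are already known for every sequence, and that a Gibbs-style step draws the motif start position in proportion to the window likelihood. The five context labels, the function names, and the uniform starting probabilities are all illustrative assumptions.

```python
import math
import random

NUCS = "ACGU"
# Assumed context labels: stem, hairpin, internal loop, multiloop, exterior.
CONTEXTS = "SHIME"


def uniform_model(length):
    """Toy model of a given motif length with uniform probabilities everywhere."""
    start = {c: 1.0 / len(CONTEXTS) for c in CONTEXTS}
    trans = [{a: {b: 1.0 / len(CONTEXTS) for b in CONTEXTS} for a in CONTEXTS}
             for _ in range(length - 1)]
    emit = [{c: {n: 1.0 / len(NUCS) for n in NUCS} for c in CONTEXTS}
            for _ in range(length)]
    return {"start": start, "trans": trans, "emit": emit}


def joint_log_prob(model, seq, struct):
    """Log-probability of a nucleotide window together with its structure labels."""
    logp = math.log(model["start"][struct[0]])
    for i, (nuc, ctx) in enumerate(zip(seq, struct)):
        if i > 0:
            # Position-specific transition between structural contexts.
            logp += math.log(model["trans"][i - 1][struct[i - 1]][ctx])
        # Nucleotide emission conditioned on position and structural context.
        logp += math.log(model["emit"][i][ctx][nuc])
    return logp


def sample_motif_start(model, seq, struct, rng):
    """One Gibbs-style step: draw a motif start in proportion to window likelihood."""
    length = len(model["emit"])
    starts = list(range(len(seq) - length + 1))
    weights = [math.exp(joint_log_prob(model, seq[s:s + length], struct[s:s + length]))
               for s in starts]
    return rng.choices(starts, weights=weights, k=1)[0]


# Hypothetical example: a 3-position motif that favors U at its first
# position when that position lies in a hairpin loop.
model = uniform_model(3)
model["emit"][0]["H"] = {"A": 0.05, "C": 0.05, "G": 0.05, "U": 0.85}
rng = random.Random(0)
start = sample_motif_start(model, "GGAUACC", "SSHHHSS", rng)
```

In a full sampler, a step like this would alternate with re-estimating the model tables from the currently selected windows across all input sequences. The sketch omits that loop to stay short.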