Functional classification of long non-coding RNAs by k-mer content

Times cited: 163
Authors
Kirk, Jessime M. [1 ,2 ,3 ]
Kim, Susan O. [1 ,2 ,9 ]
Inoue, Kaoru [1 ,2 ,9 ]
Smola, Matthew J. [4 ,10 ]
Lee, David M. [1 ,2 ,5 ]
Schertzer, Megan D. [1 ,2 ,5 ]
Wooten, Joshua S. [1 ,2 ,5 ]
Baker, Allison R. [1 ,2 ,11 ]
Sprague, Daniel [1 ,2 ,6 ]
Collins, David W. [7 ]
Horning, Christopher R. [7 ]
Wang, Shuo [7 ]
Chen, Qidi [7 ]
Weeks, Kevin M. [4 ]
Mucha, Peter J. [8 ]
Calabrese, J. Mauro [1 ,2 ]
Affiliations
[1] Univ North Carolina Chapel Hill, Dept Pharmacol, Chapel Hill, NC 27599 USA
[2] Univ North Carolina Chapel Hill, Lineberger Comprehens Canc Ctr, Chapel Hill, NC 27599 USA
[3] Univ North Carolina Chapel Hill, Curriculum Bioinformat & Computat Biol, Chapel Hill, NC USA
[4] Univ North Carolina Chapel Hill, Dept Chem, Chapel Hill, NC USA
[5] Univ North Carolina Chapel Hill, Curriculum Genet & Mol Biol, Chapel Hill, NC USA
[6] Univ North Carolina Chapel Hill, Curriculum Pharmacol, Chapel Hill, NC USA
[7] Univ North Carolina Chapel Hill, Dept Comp Sci, Chapel Hill, NC USA
[8] Univ North Carolina Chapel Hill, Carolina Ctr Interdisciplinary Appl Math, Dept Math, Chapel Hill, NC USA
[9] NIEHS, POB 12233, Res Triangle Pk, NC 27709 USA
[10] Ribometrix, Durham, NC USA
[11] Harvard Med Sch, PhD Program Biol & Biomed Sci, Boston, MA USA
Funding
National Institutes of Health (NIH), USA
Keywords
INACTIVE X-CHROMOSOME; EVOLUTION; SEQUENCE; SECONDARY; DOMAINS; LNCRNA; TRANSCRIPTOMES; VISUALIZATION; PROTEINS; GENOMICS;
DOI
10.1038/s41588-018-0207-8
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007 ; 090102 ;
Abstract
The functions of most long non-coding RNAs (lncRNAs) are unknown. In contrast to proteins, lncRNAs with similar functions often lack linear sequence homology; thus, the identification of function in one lncRNA rarely informs the identification of function in others. We developed a sequence comparison method to deconstruct linear sequence relationships in lncRNAs and evaluate similarity based on the abundance of short motifs called k-mers. We found that lncRNAs of related function often had similar k-mer profiles despite lacking linear homology, and that k-mer profiles correlated with protein binding to lncRNAs and with their subcellular localization. Using a novel assay to quantify Xist-like regulatory potential, we directly demonstrated that evolutionarily unrelated lncRNAs can encode similar function through different spatial arrangements of related sequence motifs. K-mer-based classification is a powerful approach to detect recurrent relationships between sequence and function in lncRNAs.
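The idea of comparing sequences by k-mer abundance rather than by linear alignment can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmer_profile` and `pearson`, the choice of k, the frequency normalization, and the use of Pearson correlation as the similarity measure are all assumptions made for illustration only.

```python
from itertools import product
from math import sqrt


def kmer_profile(seq, k=3):
    """Return relative k-mer frequencies for seq, in a fixed lexicographic order.

    Hypothetical helper: normalizes counts by the total number of counted
    k-mers so that sequences of different lengths are comparable.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA alphabets alike
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[w] / total for w in kmers]


def pearson(x, y):
    """Pearson correlation between two equal-length profiles (0.0 if undefined)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Two toy sequences built from the same motifs in a different order:
    # they would align poorly end to end, yet share most of their k-mers.
    a = "ACGACGACGACGTTTTTTTTTTTT"
    b = "TTTTTTTTTTTTACGACGACGACG"
    print(pearson(kmer_profile(a), kmer_profile(b)))
```

Because the profile discards positional information, two sequences that arrange the same motifs differently can still score as similar, which is the property the abstract relies on.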
Pages: 1474 / +
Number of pages: 12