CoRAL: predicting non-coding RNAs from small RNA-sequencing data

Cited by: 20
Authors
Leung, Yuk Yee [1, 2]
Ryvkin, Paul [2, 3]
Ungar, Lyle H. [2, 3, 4]
Gregory, Brian D. [3, 5, 6]
Wang, Li-San [1, 2, 3, 5, 7]
Affiliations
[1] Univ Penn, Perelman Sch Med, Dept Pathol & Lab Med, Philadelphia, PA 19104 USA
[2] Univ Penn, Perelman Sch Med, Penn Ctr Bioinformat, Philadelphia, PA 19104 USA
[3] Univ Penn, Perelman Sch Med, Genom & Computat Biol Grad Grp, Philadelphia, PA 19104 USA
[4] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[5] Univ Penn, Perelman Sch Med, Penn Genome Frontiers Inst, Philadelphia, PA 19104 USA
[6] Univ Penn, Dept Biol, Philadelphia, PA 19104 USA
[7] Univ Penn, Perelman Sch Med, Inst Aging, Philadelphia, PA 19104 USA
Funding
National Science Foundation (USA);
Keywords
INTEGRATIVE ANNOTATION; REVEALS; CLASSIFICATION; EXPRESSION; MICRORNAS; GENES;
DOI
10.1093/nar/gkt426
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The surprising observation that virtually the entire human genome is transcribed means we know little about the function of many emerging classes of RNAs, except their astounding diversities. Traditional RNA function prediction methods rely on sequence or alignment information, which are limited in their abilities to classify the various collections of non-coding RNAs (ncRNAs). To address this, we developed Classification of RNAs by Analysis of Length (CoRAL), a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features including fragment length and cleavage specificity to distinguish between different ncRNA populations. We evaluated CoRAL using genome-wide small RNA sequencing data sets from four human tissue types and were able to classify six different types of RNAs with ~80% cross-validation accuracy. Analysis by CoRAL revealed that microRNAs, small nucleolar and transposon-derived RNAs are highly discernible and consistent across all human tissue types assessed, whereas long intergenic ncRNAs, small cytoplasmic RNAs and small nuclear RNAs show less consistent patterns. The ability to reliably annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using small RNA sequencing data in less well-characterized organisms.
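The abstract names fragment length and cleavage specificity as features but does not define them in this record. The sketch below is only an illustration of how such per-locus features might be computed from aligned small RNA reads. The function names, the 15 to 30 nt length window, and the use of start-position entropy as a stand-in for cleavage specificity are all assumptions made here, not CoRAL's published definitions.

```python
import math
from collections import Counter

# Illustrative only: reads are hypothetical (start_position, length) tuples
# for one locus. None of these names or thresholds come from CoRAL itself.

def length_distribution(reads, min_len=15, max_len=30):
    """Fraction of reads at each length in [min_len, max_len]."""
    counts = Counter(length for _, length in reads
                     if min_len <= length <= max_len)
    total = sum(counts.values())
    return [counts[n] / total if total else 0.0
            for n in range(min_len, max_len + 1)]

def start_position_entropy(reads):
    """Shannon entropy (bits) of 5' start positions.

    Assumed proxy for cleavage specificity: low entropy means reads
    begin at a few precise sites, high entropy means scattered starts.
    """
    counts = Counter(start for start, _ in reads)
    total = sum(counts.values())
    if not total:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def locus_features(reads):
    """Concatenate the length histogram and the entropy into one vector."""
    return length_distribution(reads) + [start_position_entropy(reads)]

# Toy loci (made-up data): one with a sharp 21-22 nt peak at a single
# start site, one with scattered starts and lengths.
mirna_like = [(100, 22)] * 8 + [(100, 21)] * 2
degraded_like = [(100 + i, 15 + i) for i in range(10)]
```

Feature vectors of this kind could then be passed to any standard classifier and scored by cross-validation; which classifier and validation scheme CoRAL uses is not stated in this record.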
Pages: 10
Related papers
29 items in total
[1]   MicroRNAs: Genomics, biogenesis, mechanism, and function (Reprinted from Cell, vol 116, pg 281-297, 2004) [J].
Bartel, David P. .
CELL, 2007, 131 (04) :11-29
[2]   U2 AS WELL AS U1 SMALL NUCLEAR RIBONUCLEOPROTEINS ARE INVOLVED IN PRE-MESSENGER RNA SPLICING [J].
BLACK, DL ;
CHABOT, B ;
STEITZ, JA .
CELL, 1985, 42 (03) :737-750
[3]   NONCODE v3.0: integrative annotation of long noncoding RNAs [J].
Bu, Dechao ;
Yu, Kuntao ;
Sun, Silong ;
Xie, Chaoyong ;
Skogerbo, Geir ;
Miao, Ruoyu ;
Xiao, Hui ;
Liao, Qi ;
Luo, Haitao ;
Zhao, Guoguang ;
Zhao, Haitao ;
Liu, Zhiyong ;
Liu, Changning ;
Chen, Runsheng ;
Zhao, Yi .
NUCLEIC ACIDS RESEARCH, 2012, 40 (D1) :D210-D215
[4]   Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses [J].
Cabili, Moran N. ;
Trapnell, Cole ;
Goff, Loyal ;
Koziol, Magdalena ;
Tazon-Vega, Barbara ;
Regev, Aviv ;
Rinn, John L. .
GENES & DEVELOPMENT, 2011, 25 (18) :1915-1927
[5]   Multifaceted mammalian transcriptome [J].
Carninci, Piero ;
Yasuda, Jun ;
Hayashizaki, Yoshihide .
CURRENT OPINION IN CELL BIOLOGY, 2008, 20 (03) :274-280
[6]   Gene selection and classification of microarray data using random forest [J].
Díaz-Uriarte, R ;
de Andrés, SA .
BMC BIOINFORMATICS, 2006, 7 (1) :art. no. 3
[7]   Meta-analysis of small RNA-sequencing errors reveals ubiquitous post-transcriptional RNA modifications [J].
Ebhardt, H. Alexander ;
Tsang, Herbert H. ;
Dai, Denny C. ;
Liu, Yifeng ;
Bostan, Babak ;
Fahlman, Richard P. .
NUCLEIC ACIDS RESEARCH, 2009, 37 (08) :2461-2470
[8]   Non-coding RNA genes and the modern RNA world [J].
Eddy, SR .
NATURE REVIEWS GENETICS, 2001, 2 (12) :919-929
[9]   Classification of ncRNAs using position and size information in deep sequencing data [J].
Erhard, Florian ;
Zimmer, Ralf .
BIOINFORMATICS, 2010, 26 (18) :i426-i432
[10]   Evidence for natural antisense transcript-mediated inhibition of microRNA function [J].
Faghihi, Mohammad Ali ;
Zhang, Ming ;
Huang, Jia ;
Modarresi, Farzaneh ;
Van der Brug, Marcel P. ;
Nalls, Michael A. ;
Cookson, Mark R. ;
St-Laurent, Georges, III ;
Wahlestedt, Claes .
GENOME BIOLOGY, 2010, 11 (05)