CoRAL: predicting non-coding RNAs from small RNA-sequencing data

Cited by: 20
Authors
Leung, Yuk Yee [1 ,2 ]
Ryvkin, Paul [2 ,3 ]
Ungar, Lyle H. [2 ,3 ,4 ]
Gregory, Brian D. [3 ,5 ,6 ]
Wang, Li-San [1 ,2 ,3 ,5 ,7 ]
Affiliations
[1] Univ Penn, Perelman Sch Med, Dept Pathol & Lab Med, Philadelphia, PA 19104 USA
[2] Univ Penn, Perelman Sch Med, Penn Ctr Bioinformat, Philadelphia, PA 19104 USA
[3] Univ Penn, Perelman Sch Med, Genom & Computat Biol Grad Grp, Philadelphia, PA 19104 USA
[4] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[5] Univ Penn, Perelman Sch Med, Penn Genome Frontiers Inst, Philadelphia, PA 19104 USA
[6] Univ Penn, Dept Biol, Philadelphia, PA 19104 USA
[7] Univ Penn, Perelman Sch Med, Inst Aging, Philadelphia, PA 19104 USA
Funding
U.S. National Science Foundation;
Keywords
INTEGRATIVE ANNOTATION; REVEALS; CLASSIFICATION; EXPRESSION; MICRORNAS; GENES;
DOI
10.1093/nar/gkt426
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The surprising observation that virtually the entire human genome is transcribed means we know little about the function of many emerging classes of RNAs, except their astounding diversity. Traditional RNA function prediction methods rely on sequence or alignment information, which is limited in its ability to classify the various collections of non-coding RNAs (ncRNAs). To address this, we developed Classification of RNAs by Analysis of Length (CoRAL), a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features including fragment length and cleavage specificity to distinguish between different ncRNA populations. We evaluated CoRAL using genome-wide small RNA sequencing data sets from four human tissue types and were able to classify six different types of RNAs with ~80% cross-validation accuracy. Analysis by CoRAL revealed that microRNAs, small nucleolar and transposon-derived RNAs are highly discernible and consistent across all human tissue types assessed, whereas long intergenic ncRNAs, small cytoplasmic RNAs and small nuclear RNAs show less consistent patterns. The ability to reliably annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using small RNA sequencing data in less well-characterized organisms.
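The abstract names fragment length and cleavage specificity as features but this record does not define them, so the following is only a minimal illustrative sketch, not the authors' implementation. It assumes each locus is described by a list of hypothetical `(start, length)` read tuples, uses the normalized read-length histogram as the length feature, and uses the Shannon entropy of read 5' start positions as an assumed proxy for cleavage specificity. All function names and the 15 to 35 nt length window are assumptions made for illustration.

```python
import math
from collections import Counter


def length_features(read_lengths, min_len=15, max_len=35):
    """Fraction of reads at each length in [min_len, max_len] (assumed window)."""
    counts = Counter(read_lengths)
    total = sum(counts[n] for n in range(min_len, max_len + 1))
    if total == 0:
        return [0.0] * (max_len - min_len + 1)
    return [counts[n] / total for n in range(min_len, max_len + 1)]


def position_entropy(five_prime_positions):
    """Shannon entropy (bits) of read 5' start positions within one locus.

    Assumed proxy for cleavage specificity: low entropy means reads start
    at few distinct positions, i.e. the precursor is cut precisely.
    """
    counts = Counter(five_prime_positions)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def locus_features(reads):
    """Build one feature vector from (start, length) tuples of a locus."""
    lengths = [length for _, length in reads]
    starts = [start for start, _ in reads]
    return length_features(lengths) + [position_entropy(starts)]
```

Under these assumptions, a locus whose reads all start at the same position and share one length yields a single histogram peak and zero entropy, the kind of pattern one might expect for a precisely processed small RNA. Vectors like this could then be passed to any standard classifier and scored by cross-validation.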
Pages: 10
Related papers
50 in total
  • [1] Predicting long non-coding RNAs using RNA sequencing
    Ilott, Nicholas E.
    Ponting, Chris P.
    METHODS, 2013, 63 (01) : 50 - 59
  • [2] Screening key long non-coding RNAs in early-stage colon adenocarcinoma by RNA-sequencing
    Liu, Ji-Xi
    Li, Wen
    Li, Jing-Tao
    Liu, Fang
    Zhou, Lei
    EPIGENOMICS, 2018, 10 (09) : 1215 - 1228
  • [3] Small Non-Coding RNAs Derived from Eukaryotic Ribosomal RNA
    Lambert, Marine
    Benmoussa, Abderrahim
    Provost, Patrick
    NON-CODING RNA, 2019, 5 (01)
  • [4] Hybridization-based reconstruction of small non-coding RNA transcripts from deep sequencing data
    Ragan, Chikako
    Mowry, Bryan J.
    Bauer, Denis C.
    NUCLEIC ACIDS RESEARCH, 2012, 40 (16) : 7633 - 7643
  • [5] Profiling of Long Non-coding RNAs and mRNAs by RNA-Sequencing in the Hippocampi of Adult Mice Following Propofol Sedation
    Fan, Jun
    Zhou, Quan
    Li, Yan
    Song, Xiuling
    Hu, Jijie
    Qin, Zaisheng
    Tang, Jing
    Tao, Tao
    FRONTIERS IN MOLECULAR NEUROSCIENCE, 2018, 11
  • [6] Deep sequencing of small RNA transcriptome reveals novel non-coding RNAs in hepatocellular carcinoma
    Law, Priscilla T. -Y.
    Qin, Hao
    Ching, Arthur K. -K.
    Lai, Keng Po
    Co, Ngai Na
    He, Mian
    Lung, Raymond W. -M.
    Chan, Anthony W. -H.
    Chan, Ting-Fung
    Wong, Nathalie
    JOURNAL OF HEPATOLOGY, 2013, 58 (06) : 1165 - 1173
  • [7] Small RNAs derived from structural non-coding RNAs
    Chen, Chong-Jian
    Heard, Edith
    METHODS, 2013, 63 (01) : 76 - 84
  • [8] Cloning and characterization of small non-coding RNAs from grape
    Carra, Andrea
    Mica, Erica
    Gambino, Giorgio
    Pindo, Massimo
    Moser, Claudio
    Pe, Mario Enrico
    Schubert, Andrea
    PLANT JOURNAL, 2009, 59 (05) : 750 - 763
  • [9] A bioinformatics approach to identify novel long, non-coding RNAs in breast cancer cell lines from an existing RNA-sequencing dataset
    Zaheed, Oza
    Samson, Julia
    Dean, Kellie
    NON-CODING RNA RESEARCH, 2020, 5 (02) : 48 - 59
  • [10] Non-coding small RNAs and spermatogenesis
    Romero, Yannick
    Calvel, Pierre
    Nef, Serge
    M S-MEDECINE SCIENCES, 2012, 28 (05) : 490 - 496