Prediction of novel long non-coding RNAs based on RNA-Seq data of mouse Klf1 knockout study

Cited by: 107
Authors
Sun, Lei [1, 2, 3, 4]
Zhang, Zhihua [2, 3]
Bailey, Timothy L. [4 ]
Perkins, Andrew C. [5 ]
Tallack, Michael R. [5 ]
Xu, Zhao [1 ]
Liu, Hui [1 ]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Elect Engn, Xuzhou 221008, Jiangsu, Peoples R China
[2] Chinese Acad Sci, Beijing Inst Genom, Ctr Computat Biol, Beijing 100029, Peoples R China
[3] Chinese Acad Sci, Beijing Inst Genom, Lab Dis Genom & Personalized Med, Beijing 100029, Peoples R China
[4] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
[5] Mater Hosp, Mater Med Res Inst, Brisbane, Qld 4101, Australia
Funding
China Postdoctoral Science Foundation;
Keywords
INTEGRATIVE ANNOTATION; TRANSCRIPTION FACTOR; GENOME; SEQUENCES; REVEALS; QUANTIFICATION; IDENTIFICATION; PLURIPOTENCY; TOPHAT; GALAXY;
DOI
10.1186/1471-2105-13-331
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: Research on long non-coding RNAs (lncRNAs) has been accelerated by high-throughput RNA sequencing (RNA-Seq). However, identifying lncRNAs from RNA-Seq data is still not trivial, and uncovering their functions remains a challenge. Results: We present a computational pipeline for detecting novel lncRNAs from RNA-Seq data. First, genome-guided transcriptome reconstruction is used to generate initially assembled transcripts. Possible partial transcripts and artefacts are then filtered out according to their quantified expression level. After that, novel lncRNAs are detected by further filtering out known transcripts and those with high protein-coding potential, using a newly developed program called lncRScan. We applied our pipeline to a mouse Klf1 knockout dataset and used differential expression analysis to discuss the plausible functions of the novel lncRNAs we detected. We identified 308 novel lncRNA candidates, which have shorter transcripts, fewer exons, and shorter putative open reading frames than known protein-coding transcripts. Of these lncRNAs, 52 large intergenic ncRNAs (lincRNAs) show lower expression levels than protein-coding transcripts, and 13 lncRNAs show significant differential expression between the wild-type and Klf1 knockout conditions. Conclusions: Our method can predict a set of novel lncRNAs from RNA-Seq data. Some of the lncRNAs are differentially expressed between the wild-type and Klf1 knockout strains, suggesting that these novel lncRNAs can be given high priority in further functional studies.
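The filtering order described in the abstract (expression level first, then known transcripts, then protein-coding potential) can be illustrated with a minimal sketch. This is not lncRScan and does not reproduce the authors' implementation: the `Transcript` record, the `select_novel_lncrna_candidates` function, the score scale, and both thresholds are hypothetical, chosen only to show how successive filters narrow an assembled transcript set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for one assembled transcript (illustration only)."""
    transcript_id: str
    expression: float        # quantified expression level; the unit is an assumption
    is_known: bool           # True if it matches an already annotated transcript
    coding_potential: float  # assumed score in [0, 1]; higher means more likely coding


def select_novel_lncrna_candidates(transcripts,
                                   min_expression=1.0,        # assumed cutoff
                                   max_coding_potential=0.5):  # assumed cutoff
    """Apply the three filters in the order given in the abstract."""
    # Step 1: drop likely partial transcripts and artefacts (low expression).
    expressed = [t for t in transcripts if t.expression >= min_expression]
    # Step 2: drop transcripts that are already annotated.
    novel = [t for t in expressed if not t.is_known]
    # Step 3: drop transcripts with high protein-coding potential.
    return [t for t in novel if t.coding_potential <= max_coding_potential]


# Made-up input: each of T1-T3 fails exactly one filter, T4 passes all three.
EXAMPLE = [
    Transcript("T1", expression=0.2, is_known=False, coding_potential=0.1),
    Transcript("T2", expression=5.0, is_known=True, coding_potential=0.1),
    Transcript("T3", expression=5.0, is_known=False, coding_potential=0.9),
    Transcript("T4", expression=5.0, is_known=False, coding_potential=0.1),
]

if __name__ == "__main__":
    print([t.transcript_id for t in select_novel_lncrna_candidates(EXAMPLE)])
```

Running the filters in this order means the cheapest check (expression) removes transcripts before the annotation and coding-potential checks are applied; the actual tools and cutoffs used in the paper are not stated in this record.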
Pages: 12
Related papers
46 in total
[1]   The eukaryotic genome as an RNA machine [J].
Amaral, Paulo P. ;
Dinger, Marcel E. ;
Mercer, Tim R. ;
Mattick, John S. .
SCIENCE, 2008, 319 (5871) :1787-1789
[2]  
[Anonymous], GENOME RES
[3]   Long noncoding RNAs: the search for function [J].
Baker, Monya .
NATURE METHODS, 2011, 8 (05) :379-383
[4]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
[5]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[6]   RNA meets chromatin [J].
Bernstein, E ;
Allis, CD .
GENES & DEVELOPMENT, 2005, 19 (14) :1635-1655
[7]   Global identification of human transcribed sequences with genome tiling arrays [J].
Bertone, P ;
Stolc, V ;
Royce, TE ;
Rozowsky, JS ;
Urban, AE ;
Zhu, XW ;
Rinn, JL ;
Tongprasit, W ;
Samanta, M ;
Weissman, S ;
Gerstein, M ;
Snyder, M .
SCIENCE, 2004, 306 (5705) :2242-2246
[8]   Making whole genome multiple alignments usable for biologists [J].
Blankenberg, Daniel ;
Taylor, James ;
Nekrutenko, Anton .
BIOINFORMATICS, 2011, 27 (17) :2426-2428
[9]   NONCODE v3.0: integrative annotation of long noncoding RNAs [J].
Bu, Dechao ;
Yu, Kuntao ;
Sun, Silong ;
Xie, Chaoyong ;
Skogerbo, Geir ;
Miao, Ruoyu ;
Xiao, Hui ;
Liao, Qi ;
Luo, Haitao ;
Zhao, Guoguang ;
Zhao, Haitao ;
Liu, Zhiyong ;
Liu, Changning ;
Chen, Runsheng ;
Zhao, Yi .
NUCLEIC ACIDS RESEARCH, 2012, 40 (D1) :D210-D215
[10]   Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses [J].
Cabili, Moran N. ;
Trapnell, Cole ;
Goff, Loyal ;
Koziol, Magdalena ;
Tazon-Vega, Barbara ;
Regev, Aviv ;
Rinn, John L. .
GENES & DEVELOPMENT, 2011, 25 (18) :1915-1927