Improved definition of the mouse transcriptome via targeted RNA sequencing

Cited by: 23
Authors
Bussotti, Giovanni [1,7]
Leonardi, Tommaso [1,2]
Clark, Michael B. [2,3]
Mercer, Tim R. [4]
Crawford, Joanna [5]
Malquori, Lorenzo [5]
Notredame, Cedric [6]
Dinger, Marcel E. [2,4]
Mattick, John S. [2,4]
Enright, Anton J. [1]
Affiliations
[1] EMBL, European Bioinformat Inst, Cambridge CB10 1SD, England
[2] Garvan Inst Med Res, Sydney, NSW 2010, Australia
[3] Univ Oxford, Dept Physiol Anat & Genet, MRC Funct Genom Unit, Oxford OX1 3PT, England
[4] UNSW Australia, St Vincents Clin Sch, Sydney, NSW 2052, Australia
[5] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
[6] CRG, Comparat Bioinformat Bioinformat & Genom Program, Barcelona 08003, Spain
[7] Inst Pasteur, Hub Bioinformat & Biostat, C3BI, F-75724 Paris 15, France
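Funding
UK Medical Research Council; National Health and Medical Research Council of Australia; UK Biotechnology and Biological Sciences Research Council
Keywords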
LONG NONCODING RNAS; LINKED MENTAL-RETARDATION; GENOME ANNOTATION; TM4SF2; GENE; SEQ DATA; REVEALS; ALIGNMENT; RECONSTRUCTION; DISCOVERY; BROWSER;
DOI
10.1101/gr.199760.115
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Targeted RNA sequencing (CaptureSeq) uses oligonucleotide probes to capture RNAs for sequencing, providing enriched read coverage, accurate measurement of gene structure, and quantitative expression data. We applied CaptureSeq to refine transcript annotations in the current murine GRCm38 assembly. More than 23,000 regions corresponding to putative or annotated long noncoding RNAs (lncRNAs) and 154,281 known splicing junction sites were selected for targeted sequencing across five mouse tissues and three brain subregions. The results illustrate that the mouse transcriptome is considerably more complex than previously thought. We assemble more complete transcript isoforms than GENCODE, expand transcript boundaries, and connect interspersed islands of mapped reads. We describe a novel filtering pipeline that identifies previously unannotated but high-quality transcript isoforms. In this set, 911 GENCODE neighboring genes are condensed into 400 expanded gene models. Additionally, 594 GENCODE lncRNAs acquire an open reading frame (ORF) when their structure is extended with CaptureSeq. Finally, we validate our observations using current FANTOM and Mouse ENCODE resources.
Pages: 705-716
Page count: 12