Corset: enabling differential gene expression analysis for de novo assembled transcriptomes

Times cited: 579
Authors
Davidson, Nadia M. [1]
Oshlack, Alicia [1, 2]
Affiliations
[1] Royal Childrens Hosp, Murdoch Childrens Res Inst, Parkville, Vic 3052, Australia
[2] Univ Melbourne, Dept Genet, Melbourne, Vic, Australia
Funding
National Health and Medical Research Council of Australia
Keywords
RNA-SEQ DATA; GENERATION; READS; OPTIMIZATION; ANNOTATION; DISCOVERY; BLAST; WHEAT;
DOI
10.1186/s13059-014-0410-6
Chinese Library Classification number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Next generation sequencing has made it possible to perform differential gene expression studies in non-model organisms. For these studies, the need for a reference genome is circumvented by performing de novo assembly on the RNA-seq data. However, transcriptome assembly produces a multitude of contigs, which must be clustered into genes prior to differential gene expression detection. Here we present Corset, a method that hierarchically clusters contigs using shared reads and expression, then summarizes read counts to clusters, ready for statistical testing. Using a range of metrics, we demonstrate that Corset out-performs alternative methods. Corset is available from https://code.google.com/p/corset-project/.
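The clustering idea described in the abstract can be illustrated with a small toy sketch. This is not Corset's implementation: the distance formula, the 0.3 threshold, the function names `cluster_contigs` and `count_reads`, and the toy data are all assumptions made for illustration, and the expression-based part of the method is left out entirely. The sketch only shows how contigs that share many multi-mapped reads could be merged agglomeratively, and how reads could then be counted once per cluster and sample.

```python
from itertools import combinations


def cluster_contigs(read_to_contigs, threshold=0.3):
    """Greedy agglomerative clustering of contigs by shared reads.

    Assumed distance: 1 - shared_reads / min(reads_a, reads_b).
    Merging stops once the closest pair is at or above `threshold`.
    """
    # Invert the mapping: contig -> set of read ids aligned to it.
    reads_of = {}
    for read, contigs in read_to_contigs.items():
        for contig in contigs:
            reads_of.setdefault(contig, set()).add(read)

    # Start with one cluster per contig.
    clusters = {frozenset([c]): set(r) for c, r in reads_of.items()}

    def distance(a, b):
        shared = len(clusters[a] & clusters[b])
        return 1.0 - shared / min(len(clusters[a]), len(clusters[b]))

    while len(clusters) > 1:
        # Sort cluster keys so ties are broken deterministically.
        keys = sorted(clusters, key=sorted)
        a, b = min(combinations(keys, 2), key=lambda p: distance(*p))
        if distance(a, b) >= threshold:
            break
        # Merge the closest pair; the new cluster holds the union of reads.
        merged_reads = clusters.pop(a) | clusters.pop(b)
        clusters[a | b] = merged_reads
    return clusters


def count_reads(clusters, read_sample):
    """Count each read once per cluster, split by sample."""
    counts = {}
    for members, reads in clusters.items():
        per_sample = {}
        for read in reads:
            sample = read_sample[read]
            per_sample[sample] = per_sample.get(sample, 0) + 1
        counts[members] = per_sample
    return counts


# Hypothetical toy data: reads r1 and r2 map to both c1 and c2.
toy_alignments = {
    "r1": ["c1", "c2"],
    "r2": ["c1", "c2"],
    "r3": ["c1"],
    "r4": ["c3"],
    "r5": ["c3"],
}
toy_samples = {"r1": "A", "r2": "B", "r3": "A", "r4": "A", "r5": "B"}

toy_clusters = cluster_contigs(toy_alignments)
toy_counts = count_reads(toy_clusters, toy_samples)
```

In this toy input, c1 and c2 share every read that maps to c2, so they end up in one cluster, while c3 shares no reads with either and stays separate. The resulting per-cluster, per-sample counts are the kind of table that a count-based testing tool would take as input.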
Pages: 14