Corset: enabling differential gene expression analysis for de novo assembled transcriptomes

Cited by: 544
Authors
Davidson, Nadia M. [1]
Oshlack, Alicia [1,2]
Affiliations
[1] Royal Childrens Hosp, Murdoch Childrens Res Inst, Parkville, Vic 3052, Australia
[2] Univ Melbourne, Dept Genet, Melbourne, Vic, Australia
Source
GENOME BIOLOGY | 2014 / Vol. 15 / Issue 07
Funding
National Health and Medical Research Council (NHMRC) of Australia;
Keywords
RNA-SEQ DATA; GENERATION; READS; OPTIMIZATION; ANNOTATION; DISCOVERY; BLAST; WHEAT;
DOI
10.1186/s13059-014-0410-6
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Next generation sequencing has made it possible to perform differential gene expression studies in non-model organisms. For these studies, the need for a reference genome is circumvented by performing de novo assembly on the RNA-seq data. However, transcriptome assembly produces a multitude of contigs, which must be clustered into genes prior to differential gene expression detection. Here we present Corset, a method that hierarchically clusters contigs using shared reads and expression, then summarizes read counts to clusters, ready for statistical testing. Using a range of metrics, we demonstrate that Corset out-performs alternative methods. Corset is available from https://code.google.com/p/corset-project/.
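The clustering idea in the abstract (group contigs that share reads, then report one read count per cluster) can be illustrated with a toy sketch. This is not Corset's implementation: the distance measure (one minus the fraction of shared reads relative to the smaller contig), the merge threshold of 0.5, the function names, and the example data are all assumptions made for illustration, and the expression-based part of the method is omitted.

```python
from itertools import combinations

def cluster_contigs(read_to_contigs, threshold=0.5):
    """Greedy agglomerative clustering of contigs by shared reads.

    read_to_contigs maps a read id to the contigs it aligns to.
    Returns a dict: frozenset of contig names -> set of read ids.
    The distance and threshold are illustrative assumptions.
    """
    # Start with one cluster per contig, holding the reads aligned to it.
    clusters = {}
    for read, contigs in read_to_contigs.items():
        for contig in contigs:
            clusters.setdefault(frozenset([contig]), set()).add(read)

    def distance(a, b):
        shared = len(clusters[a] & clusters[b])
        return 1.0 - shared / min(len(clusters[a]), len(clusters[b]))

    while len(clusters) > 1:
        # Find the closest pair; sorting keeps tie-breaking deterministic.
        keys = sorted(clusters, key=sorted)
        a, b = min(combinations(keys, 2), key=lambda pair: distance(*pair))
        if distance(a, b) > threshold:
            break  # remaining clusters share too few reads to merge
        clusters[a | b] = clusters.pop(a) | clusters.pop(b)
    return clusters

def summarize_counts(clusters):
    """Count each read once per cluster it belongs to."""
    return {"|".join(sorted(names)): len(reads) for names, reads in clusters.items()}

# Hypothetical alignments: contigs A and B share two reads, B and C share one.
EXAMPLE = {
    "r1": ["A", "B"],
    "r2": ["A", "B"],
    "r3": ["A"],
    "r4": ["C"],
    "r5": ["C"],
    "r6": ["B", "C"],
}

if __name__ == "__main__":
    print(summarize_counts(cluster_contigs(EXAMPLE)))
```

In this example A and B merge because they share two of their three reads, while C stays separate. The resulting per-cluster counts are the kind of table that would then be passed to a count-based differential expression test.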
Pages: 14