A comparison across non-model animals suggests an optimal sequencing depth for de novo transcriptome assembly

Cited by: 70
Authors
Francis, Warren R. [1, 2]
Christianson, Lynne M. [1]
Kiko, Rainer [3]
Powers, Meghan L. [1, 2]
Shaner, Nathan C. [4]
Haddock, Steven H. D. [1]
Affiliations
[1] Monterey Bay Aquarium Res Inst, Moss Landing, CA 95039 USA
[2] Univ Calif Santa Cruz, Dept Ocean Sci, Santa Cruz, CA 95064 USA
[3] GEOMAR, Helmholtz Ctr Ocean Res Kiel, D-24105 Kiel, Germany
[4] Scintillon Inst, San Diego, CA 92121 USA
Keywords
RNA-SEQ DATA; DIFFERENTIAL EXPRESSION; GENES; NORMALIZATION
DOI
10.1186/1471-2164-14-167
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: The lack of genomic resources can present challenges for studies of non-model organisms. Transcriptome sequencing offers an attractive method to gather information about genes and gene expression without the need for a reference genome. However, it is unclear what sequencing depth is adequate to assemble the transcriptome de novo for these purposes.
Results: We assembled transcriptomes of animals from six different phyla (Annelids, Arthropods, Chordates, Cnidarians, Ctenophores, and Molluscs) at regular increments of reads using Velvet/Oases and Trinity to determine how read count affects the assembly. This included an assembly of mouse heart reads because we could compare those against the reference genome that is available. We found qualitative differences in the assemblies of whole-animals versus tissues. With increasing reads, whole-animal assemblies show rapid increase of transcripts and discovery of conserved genes, while single-tissue assemblies show a slower discovery of conserved genes though the assembled transcripts were often longer. A deeper examination of the mouse assemblies shows that with more reads, assembly errors become more frequent but such errors can be mitigated with more stringent assembly parameters.
Conclusions: These assembly trends suggest that representative assemblies are generated with as few as 20 million reads for tissue samples and 30 million reads for whole-animals for RNA-level coverage. These depths provide a good balance between coverage and noise. Beyond 60 million reads, the discovery of new genes is low and sequencing errors of highly-expressed genes are likely to accumulate. Finally, siphonophores (polymorphic Cnidarians) are an exception and possibly require alternate assembly strategies.
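The "regular increments of reads" design described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `read_counts` and `first_n_reads` are hypothetical, the input is assumed to be standard four-line FASTQ records, and taking the first N reads (rather than a random subsample) is an assumption, since the abstract does not say how subsets were drawn.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def read_counts(total_reads: int, step: int) -> List[int]:
    """Return cumulative subset sizes: step, 2*step, ..., ending at total_reads."""
    if total_reads <= 0 or step <= 0:
        raise ValueError("total_reads and step must be positive")
    counts = list(range(step, total_reads + 1, step))
    if not counts or counts[-1] != total_reads:
        counts.append(total_reads)  # always include the full data set
    return counts


def first_n_reads(fastq_lines: Iterable[str], n_reads: int) -> Iterator[str]:
    """Yield the lines of the first n_reads records.

    Assumption: each FASTQ record spans exactly four lines
    (header, sequence, '+', qualities).
    """
    return islice(fastq_lines, 4 * n_reads)
```

Each subset produced this way would then be passed to an assembler separately, so that transcript counts and conserved-gene discovery can be compared across depths.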
Pages: 1 / 12
Number of pages: 11