Assembly and annotation of a non-model gastropod (Nerita melanotragus) transcriptome: A comparison of de novo assemblers

Cited by: 26
Authors
Amin S. [1]
Prentis P.J. [2]
Gilding E.K. [3]
Pavasovic A. [1]
Affiliations
[1] School of Biomedical Sciences, Faculty of Health, Queensland University of Technology, GPO Box 2434, Brisbane, 4001, QLD
[2] School of Earth, Environmental and Biological Sciences, Science and Engineering Faculty, Queensland University of Technology, GPO Box 2434, Brisbane, 4001, QLD
[3] Institute for Molecular Biosciences, University of Queensland, St Lucia, 4072, QLD
Keywords
De novo assembly; Heat shock protein; Ion Torrent; Nerita melanotragus; Transcriptome
DOI
10.1186/1756-0500-7-488
Abstract
Background: The sequencing, de novo assembly and annotation of transcriptome datasets generated with next-generation sequencing (NGS) have enabled biologists to answer genomic questions in non-model species with unprecedented ease. Reliable and accurate de novo assembly and annotation are, however, critically important for transcriptomes generated from short-read sequences, and typical benchmarks of assembly and annotation reliability have been performed with model species. To address the reliability and accuracy of de novo transcriptome assembly in non-model species, we generated an RNA-seq dataset for an intertidal gastropod mollusc, Nerita melanotragus, and compared the assemblies produced by four de novo transcriptome assemblers (Velvet, Oases, Geneious and Trinity) on a number of quality metrics and on redundancy.
Results: Transcriptome sequencing on the Ion Torrent PGM™ produced 1,883,624 raw reads with a mean length of 133 base pairs (bp). The Trinity and Oases de novo assemblers produced the best assemblies on all quality metrics: fewer contigs, higher N50 and average contig length, and longer contigs. Overall, the BLAST and annotation success of our assemblies was not high, with only 15-19% of contigs assigned a putative function.
Conclusions: We believe that any improvement in the annotation success of gastropod species will require more gastropod genome sequences and, in particular, more mollusc protein sequences in public databases. Overall, this paper demonstrates that reliable and accurate de novo transcriptome assemblies can be generated from short-read sequencers with the right assembly algorithms.
© 2014 Amin et al.; licensee BioMed Central Ltd.
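The abstract ranks the assemblers by contig count, N50, average contig length and longest contig. As an illustration only, and not the authors' pipeline, the sketch below shows how such summary metrics might be computed from a list of contig lengths. The function names `n50` and `summarize` and the example lengths are hypothetical; N50 is taken here in its usual sense, the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases.

```python
from statistics import mean


def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def summarize(lengths):
    """Collect the assembly metrics named in the abstract (hypothetical helper)."""
    return {
        "contigs": len(lengths),
        "mean_length": mean(lengths),
        "longest": max(lengths),
        "n50": n50(lengths),
    }


# Made-up contig lengths, for illustration only (not data from the paper).
example_lengths = [100, 200, 300, 400, 500]
print(summarize(example_lengths))
```

When such a summary is used to compare assemblies, a higher N50 and mean length together with fewer contigs are generally read as a less fragmented assembly, which matches the criteria the abstract lists.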