Comprehensive evaluation of de novo transcriptome assembly programs and their effects on differential gene expression analysis

Cited by: 51
Authors
Wang, Sufang [1]
Gribskov, Michael [1, 2]
Affiliations
[1] Purdue Univ, Dept Biol Sci, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Keywords
RNA-SEQ DATA; QUANTIFICATION;
DOI
10.1093/bioinformatics/btw625
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: With the decreasing cost of RNA-Seq, an increasing number of non-model organisms have been sequenced. Because these organisms lack reference genomes, de novo transcriptome assembly is required. However, there is limited systematic research evaluating the quality of de novo transcriptome assemblies and how assembly quality influences downstream analysis.
Results: We used two authentic RNA-Seq datasets from Arabidopsis thaliana and produced transcriptome assemblies using eight programs (BinPacker, Bridger, IDBA-tran, Oases-Velvet, SOAPdenovo-Trans, SSP, Trans-ABySS and Trinity) with a series of k-mer sizes (from 25 to 71). We measured assembly quality in terms of reference genome base and gene coverage, transcriptome assembly base coverage, number of chimeras and number of recovered full-length transcripts. SOAPdenovo-Trans performed best in base coverage, while Trans-ABySS performed best in gene coverage and number of recovered full-length transcripts. In terms of chimeric sequences, BinPacker and Oases-Velvet were the worst, while IDBA-tran, SOAPdenovo-Trans, Trans-ABySS and Trinity produced fewer chimeras across all single k-mer assemblies. In differential gene expression analysis, about 70% of the significantly differentially expressed genes (DEG) were the same whether the reference genome or the de novo assemblies were used. We further identify four reasons for the differences in significant DEG between reference genome and de novo transcriptome assemblies: incomplete annotation, exon-level differences, transcript fragmentation and incorrect gene annotation. These findings suggest that de novo assembly is beneficial even when a reference genome is available.
Pages: 327-333
Number of pages: 7