Comprehensive evaluation of de novo transcriptome assembly programs and their effects on differential gene expression analysis

Cited by: 51
Authors
Wang, Sufang [1]
Gribskov, Michael [1, 2]
Affiliations
[1] Purdue Univ, Dept Biol Sci, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Keywords
RNA-SEQ DATA; QUANTIFICATION
DOI
10.1093/bioinformatics/btw625
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: With the decreased cost of RNA-Seq, an increasing number of non-model organisms have been sequenced. Because these organisms lack reference genomes, de novo transcriptome assembly is required. However, there is limited systematic research evaluating the quality of de novo transcriptome assemblies and how assembly quality influences downstream analysis.
Results: We used two authentic RNA-Seq datasets from Arabidopsis thaliana and produced transcriptome assemblies using eight programs (BinPacker, Bridger, IDBA-tran, Oases-Velvet, SOAPdenovo-Trans, SSP, Trans-ABySS and Trinity) with a series of k-mer sizes (from 25 to 71). We measured assembly quality in terms of reference genome base and gene coverage, transcriptome assembly base coverage, number of chimeras and number of recovered full-length transcripts. SOAPdenovo-Trans performed best in base coverage, while Trans-ABySS performed best in gene coverage and number of recovered full-length transcripts. In terms of chimeric sequences, BinPacker and Oases-Velvet performed worst, while IDBA-tran, SOAPdenovo-Trans, Trans-ABySS and Trinity produced fewer chimeras across all single k-mer assemblies. In differential gene expression analysis, about 70% of the significantly differentially expressed genes (DEG) were the same using the reference genome and the de novo assemblies. We further identified four reasons for the differences in significant DEG between the reference genome and de novo transcriptome assemblies: incomplete annotation, exon-level differences, transcript fragmentation and incorrect gene annotation. These findings suggest that de novo assembly is beneficial even when a reference genome is available.
Pages: 327-333
Page count: 7
Related papers
50 in total
  • [1] De novo assembly and comparative analysis of the Ceratodon purpureus transcriptome
    Szoevenyi, Peter
    Perroud, Pierre-Francois
    Symeonidi, Aikaterini
    Stevenson, Sean
    Quatrano, Ralph S.
    Rensing, Stefan A.
    Cuming, Andrew C.
    McDaniel, Stuart F.
    MOLECULAR ECOLOGY RESOURCES, 2015, 15 (01) : 203 - 215
  • [2] Differential gene expression analysis between anagen and telogen of Capra hircus skin based on the de novo assembled transcriptome sequence
    Xu, Teng
    Guo, Xudong
    Wang, Hui
    Hao, Fei
    Du, Xiaoyuan
    Gao, Xiaoyu
    Liu, Dongjun
    GENE, 2013, 520 (01) : 30 - 38
  • [3] Corset: enabling differential gene expression analysis for de novo assembled transcriptomes
    Davidson, Nadia M.
    Oshlack, Alicia
    GENOME BIOLOGY, 2014, 15 (07):
  • [4] De Novo Transcriptome Analysis of Differential Functional Gene Expression in Largemouth Bass (Micropterus salmoides) after Challenge with Nocardia seriolae
    Byadgi, Omkar
    Chen, Chi-Wen
    Wang, Pei-Chyi
    Tsai, Ming-An
    Chen, Shih-Chu
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2016, 17 (08)
  • [5] De novo assembly and transcriptome analysis of sclerotial development in Wolfiporia cocos
    Wu, Yayun
    Zhu, Wenjun
    Wei, Wei
    Zhao, Xiaolong
    Wang, Qi
    Zeng, Wanyong
    Zheng, Yonglian
    Chen, Ping
    Zhang, Shaopeng
    GENE, 2016, 588 (02) : 149 - 155
  • [6] De-novo transcriptome assembly for gene identification, analysis, annotation, and molecular marker discovery in Onobrychis viciifolia
    Mora-Ortiz, Marina
    Swain, Martin T.
    Vickers, Martin J.
    Hegarty, Matthew J.
    Kelly, Rhys
    Smith, Lydia M. J.
    Skot, Leif
    BMC GENOMICS, 2016, 17
  • [7] Digital Gene Expression Analysis Based on Integrated De Novo Transcriptome Assembly of Sweet Potato [Ipomoea batatas (L.) Lam.]
    Tao, Xiang
    Gu, Ying-Hong
    Wang, Hai-Yan
    Zheng, Wen
    Li, Xiao
    Zhao, Chuan-Wu
    Zhang, Yi-Zheng
    PLOS ONE, 2012, 7 (04):
  • [8] Informed kmer selection for de novo transcriptome assembly
    Durai, Dilip A.
    Schulz, Marcel H.
    BIOINFORMATICS, 2016, 32 (11) : 1670 - 1677
  • [9] De Novo Assembly and Characterization of the Xenocatantops brachycerus Transcriptome
    Zhao, Le
    Zhang, Xinmei
    Qiu, Zhongying
    Huang, Yuan
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2018, 19 (02)
  • [10] A simple guide to de novo transcriptome assembly and annotation
    Raghavan, Venket
    Kraft, Louis
    Mesny, Fantin
    Rigerte, Linda
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (02)