Comprehensive evaluation of de novo transcriptome assembly programs and their effects on differential gene expression analysis

Cited by: 51
Authors
Wang, Sufang [1]
Gribskov, Michael [1, 2]
Affiliations
[1] Purdue Univ, Dept Biol Sci, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Keywords
RNA-SEQ DATA; QUANTIFICATION
DOI
10.1093/bioinformatics/btw625
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: With the decreased cost of RNA-Seq, an increasing number of non-model organisms have been sequenced. Because these organisms lack reference genomes, de novo transcriptome assembly is required. However, there is limited systematic research evaluating the quality of de novo transcriptome assemblies and how assembly quality influences downstream analysis. Results: We used two authentic RNA-Seq datasets from Arabidopsis thaliana and produced transcriptome assemblies using eight programs with a series of k-mer sizes (from 25 to 71): BinPacker, Bridger, IDBA-tran, Oases-Velvet, SOAPdenovo-Trans, SSP, Trans-ABySS and Trinity. We measured assembly quality in terms of reference genome base and gene coverage, transcriptome assembly base coverage, number of chimeras and number of recovered full-length transcripts. SOAPdenovo-Trans performed best in base coverage, while Trans-ABySS performed best in gene coverage and number of recovered full-length transcripts. In terms of chimeric sequences, BinPacker and Oases-Velvet were the worst, while IDBA-tran, SOAPdenovo-Trans, Trans-ABySS and Trinity produced fewer chimeras across all single k-mer assemblies. In differential gene expression analysis, about 70% of the significantly differentially expressed genes (DEGs) were the same using the reference genome and the de novo assemblies. We further identify four reasons for the differences in significant DEGs between reference genome and de novo transcriptome assemblies: incomplete annotation, exon-level differences, transcript fragmentation and incorrect gene annotation. These findings suggest that de novo assembly is beneficial even when a reference genome is available.
Pages: 327-333
Page count: 7
Related papers
50 in total
  • [21] De Novo Deep Transcriptome Analysis of Medicinal Plants for Gene Discovery in Biosynthesis of Plant Natural Products
    Han, R.
    Rai, A.
    Nakamura, M.
    Suzuki, H.
    Takahashi, H.
    Yamazaki, M.
    Saito, K.
    SYNTHETIC BIOLOGY AND METABOLIC ENGINEERING IN PLANTS AND MICROBES, PT B: METABOLISM IN PLANTS, 2016, 576 : 19 - 45
  • [22] De novo assembly and transcriptome characterization of spruce dwarf mistletoe Arceuthobium sichuanense uncovers gene expression profiling associated with plant development
    Wang, Yonglin
    Li, Xuewu
    Zhou, Weifen
    Li, Tao
    Tian, Chengming
    BMC GENOMICS, 2016, 17
  • [23] Transcriptome Analysis of Tomato Leaf Spot Pathogen Fusarium proliferatum: De novo Assembly, Expression Profiling, and Identification of Candidate Effectors
    Gao, Meiling
    Yao, Siyu
    Liu, Yang
    Yu, Haining
    Xu, Pinsan
    Sun, Wenhui
    Pu, Zhongji
    Hou, Hongman
    Bao, Yongming
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2018, 19 (01)
  • [24] De novo assembly of transcriptomes and differential gene expression analysis using short-read data from emerging model organisms - a brief guide
    Jackson, Daniel J.
    Cerveau, Nicolas
    Posnien, Nico
    FRONTIERS IN ZOOLOGY, 2024, 21 (01)
  • [25] De Novo Transcriptome Assembly for the Tropical Grass Panicum maximum Jacq
    Toledo-Silva, Guilherme
    Cardoso-Silva, Claudio Benicio
    Jank, Liana
    Souza, Anete Pereira
    PLOS ONE, 2013, 8 (07)
  • [26] De novo transcriptome sequencing and gene expression profiling of Elymus nutans under cold stress
    Fu, Juanjuan
    Miao, Yanjun
    Shao, Linhui
    Hu, Tianming
    Yang, Peizhi
    BMC GENOMICS, 2016, 17
  • [27] De novo transcriptome assembly of the lobster cockroach Nauphoeta cinerea (Blaberidae)
    Anversa Segatto, Ana Lacia
    Diesel, Jose Francisco
    Silva Loreto, Elgion Lucio
    Teixeira da Rocha, Joao Batista
    GENETICS AND MOLECULAR BIOLOGY, 2018, 41 (03) : 713 - 721
  • [28] De Novo assembly and annotation of the freshwater crayfish Astacus astacus transcriptome
    Theissinger, Kathrin
    Falckenhayn, Cassandra
    Blande, Daniel
    Toljamo, Anna
    Gutekunst, Julian
    Makkonen, Jenny
    Jussila, Japo
    Lyko, Frank
    Schrimpf, Anne
    Schulz, Ralf
    Kokko, Harri
    MARINE GENOMICS, 2016, 28 : 7 - 10
  • [29] De novo assembly and annotation of the Avicennia officinalis L. transcriptome
    Lyu, Haomin
    Li, Xinnian
    Guo, Zixiao
    He, Ziwen
    Shi, Suhua
    MARINE GENOMICS, 2018, 39 : 3 - 6
  • [30] Transcriptome analysis of the euryhaline alga, Prymnesium parvum (Prymnesiophyceae): effects of salinity on differential gene expression
    Talarski, Aimee
    Manning, Schonna R.
    La Claire, John W., II
    PHYCOLOGIA, 2016, 55 (01) : 33 - 44