The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome

Cited by: 6
Authors
Hoang, Nam V. [1]
Furtado, Agnelo [2]
Perlo, Virginie [2]
Botha, Frederik C. [2, 3]
Henry, Robert J. [2]
Affiliations
[1] Hue Univ, Coll Agr & Forestry, Hue, Vietnam
[2] Univ Queensland, Queensland Alliance Agr & Food Innovat, St Lucia, Qld, Australia
[3] Sugar Res Australia, Indooroopilly, Qld, Australia
Keywords
isoform sequencing; transcriptome normalization; transcript enrichment; normalization impact; sugarcane transcriptome; polyploid transcriptome; DUPLEX-SPECIFIC NUCLEASE; MESSENGER-RNA; NONCODING RNAS; WEB SERVER; GENOME; SUGARCANE; IDENTIFICATION; ANNOTATION; SACCHARUM; PATHWAYS;
DOI
10.3389/fgene.2019.00654
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Normalization of cDNA is widely used to improve the coverage of rare transcripts in transcriptome analyses that employ next-generation sequencing. Recently, long-read technology has emerged as a powerful tool for sequencing and constructing transcriptomes, especially for complex genomes containing highly similar transcripts and transcript-spliced isoforms. Here, we analyzed the transcriptome of sugarcane, a highly polyploid plant, by PacBio isoform sequencing (Iso-Seq) of two different cDNA library preparations, with and without a normalization step. The results demonstrated that, while the two libraries included many of the same transcripts, normalization removed many longer transcripts and detected many new, generally shorter transcripts. For the same input cDNA and data yield, the normalized library recovered more total transcript isoforms and a greater number of predicted gene families and orthologous groups, resulting in a higher representation of the sugarcane transcriptome compared to the non-normalized library. The non-normalized library, on the other hand, included a wider transcript length range, with more long transcripts above ~1.25 kb, more transcript isoforms per gene family, and more gene ontology terms per transcript. A large proportion of the unique transcripts, comprising ~52% of the normalized library, were expressed at a lower level than the unique transcripts from the non-normalized library across the three tissue types tested (leaf, stalk, and root). About 83% of the total 5,348 predicted long noncoding transcripts were derived from the normalized library, of which ~80% were derived from the lowly expressed fraction. Functional annotation of the unique transcripts suggested that each library enriched different functional transcript fractions. This demonstrated the complementarity of the two approaches in obtaining a complete transcriptome of a complex genome at the sequencing depth used in this study.
Pages: 17