Transcriptome diversity is a systematic source of variation in RNA-sequencing data

Cited by: 10
Authors
Garcia-Nieto, Pablo [1]
Wang, Ban [1]
Fraser, Hunter B. [1]
Affiliations
[1] Stanford Univ, Dept Biol, Stanford, CA 94305 USA
Keywords
SEQ DATA; EXPRESSION; NORMALIZATION
DOI
10.1371/journal.pcbi.1009939
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
RNA sequencing has been widely used as an essential tool to probe gene expression. While standard practices have been established to analyze RNA-seq data, it is still challenging to interpret and remove artifactual signals. Several biological and technical factors, such as sex, age, batch, and sequencing technology, have been found to bias gene expression estimates. Probabilistic estimation of expression residuals (PEER), which infers broad variance components in gene expression measurements, has been used to account for some systematic effects, but it has remained challenging to interpret these PEER factors. Here we show that transcriptome diversity, a simple metric based on Shannon entropy, explains a large portion of variability in gene expression and is the strongest known factor encoded in PEER factors. We then show that transcriptome diversity has significant associations with multiple technical and biological variables across diverse organisms and datasets. In sum, transcriptome diversity provides a simple explanation for a major source of variation in both gene expression estimates and PEER covariates.
Author summary
Although the cells in every individual organism have nearly identical DNA sequences, they differ substantially in their function; for instance, neurons are very different from muscle cells. This is in large part because different genes are transcribed from DNA into RNA, a key step in the process known as gene expression. The measurement of RNA levels is an important tool in studying biology, but it is complicated by many potentially confounding factors. To account for this, computational methods can correct for unknown confounders, but these do not provide any information about what the confounders are. Here we show that transcriptome diversity, a simple metric based on Shannon entropy, explains a large portion of variability in both gene expression measurements and the confounding factors detected by a leading method. This prevalent factor provides a simple explanation for a primary source of variation in gene expression estimates.
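The abstract describes transcriptome diversity only as "a simple metric based on Shannon entropy". As a minimal sketch of what such a metric could look like, the Python below computes the Shannon entropy of one sample's expression proportions. The function name `transcriptome_diversity`, the natural-log base, and the division by the log of the number of genes (which scales the value to the range 0 to 1) are illustrative assumptions, not details taken from the paper.

```python
import math


def transcriptome_diversity(counts, normalize=True):
    """Shannon entropy of per-gene expression proportions in one sample.

    `counts` holds one non-negative expression value per gene.
    Normalizing by log(number of genes) is an assumption made here
    for illustration; the abstract does not specify it.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no expression signal")

    entropy = 0.0
    for c in counts:
        if c > 0:  # genes with zero expression contribute nothing
            p = c / total
            entropy -= p * math.log(p)

    if normalize and len(counts) > 1:
        entropy /= math.log(len(counts))
    return entropy


if __name__ == "__main__":
    even_sample = [10, 10, 10, 10]    # expression spread evenly across genes
    skewed_sample = [97, 1, 1, 1]     # one gene dominates the sample
    print(transcriptome_diversity(even_sample))
    print(transcriptome_diversity(skewed_sample))
```

Under these assumptions, a sample whose reads are spread evenly across genes scores near 1, and a sample dominated by a few highly expressed genes scores near 0.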
Pages: 20