Differential expression in RNA-seq: A matter of depth

Cited by: 1182
Authors
Tarazona, Sonia [1, 2]
Garcia-Alcalde, Fernando [1 ]
Dopazo, Joaquin [1 ]
Ferrer, Alberto
Conesa, Ana [1 ]
Affiliations
[1] Ctr Invest Principe Felipe, Bioinformat & Genom Dept, Valencia 46012, Spain
[2] Univ Politecn Valencia, Dept Appl Stat Operat Res & Qual, Valencia 46022, Spain
Keywords
TRANSCRIPTIONAL LANDSCAPE; GENE; REPRODUCIBILITY; POLYADENYLATION; GENOME;
DOI
10.1101/gr.124321.111
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Next-generation sequencing (NGS) technologies are revolutionizing genome research, and in particular, their application to transcriptomics (RNA-seq) is increasingly being used for gene expression profiling as a replacement for microarrays. However, the properties of RNA-seq data have not yet been fully established, and additional research is needed to understand how these data respond to differential expression analysis. In this work, we set out to gain insights into the characteristics of RNA-seq data analysis by studying an important parameter of this technology: the sequencing depth. We have analyzed how sequencing depth affects the detection of transcripts and their identification as differentially expressed, looking at aspects such as transcript biotype, length, expression level, and fold-change. We have evaluated different algorithms available for the analysis of RNA-seq and proposed a novel approach, NOISeq, that differs from existing methods in that it is data-adaptive and nonparametric. Our results reveal that most existing methodologies suffer from a strong dependency on sequencing depth for their differential expression calls and that this results in a considerable number of false positives that increases as the number of reads grows. In contrast, our proposed method models the noise distribution from the actual data, can therefore better adapt to the size of the data set, and is more effective in controlling the rate of false discoveries. This work discusses the true potential of RNA-seq for studying regulation at low expression ranges, the noise within RNA-seq data, and the issue of replication.
Pages: 2213-2223
Number of pages: 11
Related papers
50 items in total
  • [31] A scaling normalization method for differential expression analysis of RNA-seq data
    Robinson, Mark D.
    Oshlack, Alicia
    GENOME BIOLOGY, 2010, 11 (03):
  • [33] Differential gene expression analysis using coexpression and RNA-Seq data
    Yang, Ei-Wen
    Girke, Thomas
    Jiang, Tao
    BIOINFORMATICS, 2013, 29 (17) : 2153 - 2161
  • [34] Impact of human gene annotations on RNA-seq differential expression analysis
    Hamaguchi, Yu
    Zeng, Chao
    Hamada, Michiaki
    BMC GENOMICS, 2021, 22 (01)
  • [35] Identification and visualization of differential isoform expression in RNA-seq time series
    Nueda, Maria Jose
    Martorell-Marugan, Jordi
    Marti, Cristina
    Tarazona, Sonia
    Conesa, Ana
    BIOINFORMATICS, 2018, 34 (03) : 524 - 526
  • [36] A scaling normalization method for differential expression analysis of RNA-seq data
    Mark D Robinson
    Alicia Oshlack
    Genome Biology, 11
  • [37] Differential Expression Analysis in RNA-seq Data Using a Geometric Approach
    Tambonis, Tiago
    Boareto, Marcelo
    Leite, Vitor B. P.
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2018, 25 (11) : 1257 - 1265
  • [38] Differential Expression Analysis of RNA-seq Reads: Overview, Taxonomy, and Tools
    Chowdhury, Hussain Ahmed
    Bhattacharyya, Dhruba Kumar
    Kalita, Jugal Kumar
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2020, 17 (02) : 566 - 586
  • [39] Comparison of software packages for detecting differential expression in RNA-seq studies
    Seyednasrollah, Fatemeh
    Laiho, Asta
    Elo, Laura L.
    BRIEFINGS IN BIOINFORMATICS, 2015, 16 (01) : 59 - 70
  • [40] A fuzzy method for RNA-Seq differential expression analysis in presence of multireads
    Consiglio, Arianna
    Mencar, Corrado
    Grillo, Giorgio
    Marzano, Flaviana
    Caratozzolo, Mariano Francesco
    Liuni, Sabino
    BMC BIOINFORMATICS, 2016, 17