Differential expression in RNA-seq: A matter of depth

Cited by: 1182
Authors
Tarazona, Sonia [1 ,2 ]
Garcia-Alcalde, Fernando [1 ]
Dopazo, Joaquin [1 ]
Ferrer, Alberto
Conesa, Ana [1 ]
Affiliations
[1] Ctr Invest Principe Felipe, Bioinformat & Genom Dept, Valencia 46012, Spain
[2] Univ Politecn Valencia, Dept Appl Stat Operat Res & Qual, Valencia 46022, Spain
Keywords
TRANSCRIPTIONAL LANDSCAPE; GENE; REPRODUCIBILITY; POLYADENYLATION; GENOME;
DOI
10.1101/gr.124321.111
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Next-generation sequencing (NGS) technologies are revolutionizing genome research, and in particular, their application to transcriptomics (RNA-seq) is increasingly being used for gene expression profiling as a replacement for microarrays. However, the properties of RNA-seq data have not yet been fully established, and additional research is needed to understand how these data respond to differential expression analysis. In this work, we set out to gain insights into the characteristics of RNA-seq data analysis by studying an important parameter of this technology: the sequencing depth. We have analyzed how sequencing depth affects the detection of transcripts and their identification as differentially expressed, looking at aspects such as transcript biotype, length, expression level, and fold-change. We have evaluated different algorithms available for the analysis of RNA-seq and proposed a novel approach, NOISeq, that differs from existing methods in that it is data-adaptive and nonparametric. Our results reveal that most existing methodologies suffer from a strong dependency on sequencing depth for their differential expression calls and that this results in a considerable number of false positives that increases as the number of reads grows. In contrast, our proposed method models the noise distribution from the actual data, can therefore better adapt to the size of the data set, and is more effective in controlling the rate of false discoveries. This work discusses the true potential of RNA-seq for studying regulation at low expression ranges, the noise within RNA-seq data, and the issue of replication.
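The depth dependency described in the abstract can be illustrated with a toy calculation. The sketch below is not NOISeq and not the analysis from the paper. It assumes a single hypothetical gene that makes up a fixed fraction of the library, a fixed modest fold-change of 1.2 between two conditions, and a simple pooled two-proportion z-test standing in for a parametric count-based test. The function names and all numbers are illustrative assumptions.

```python
import math


def two_proportion_z_pvalue(k1, n1, k2, n2):
    """Two-sided p-value for equal proportions (pooled normal approximation)."""
    p_pool = (k1 + k2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (k1 / n1 - k2 / n2) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def pvalues_by_depth(base_fraction, fold_change, depths):
    """P-value of the same fold-change at each sequencing depth.

    Counts are set to their expected values (no sampling noise), so the
    only thing that varies between entries is the number of reads.
    """
    results = {}
    for depth in depths:
        k1 = round(base_fraction * depth)                # condition 1 count
        k2 = round(base_fraction * fold_change * depth)  # condition 2 count
        results[depth] = two_proportion_z_pvalue(k1, depth, k2, depth)
    return results


# Hypothetical gene: 10 reads per million in condition 1, 12 in condition 2.
depths = [1_000_000, 10_000_000, 100_000_000]
pvals = pvalues_by_depth(base_fraction=1e-5, fold_change=1.2, depths=depths)
for depth, p in pvals.items():
    print(f"{depth:>11,d} reads per condition: p = {p:.3g}")
```

Under these assumptions the fold-change is identical at every depth, yet the p-value shrinks as reads are added, so a fixed significance cutoff calls more and more small differences as the library grows.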
Pages: 2213-2223
Page count: 11
Related papers
50 in total
  • [1] Power analysis for RNA-Seq differential expression studies
    Yu, Lianbo
    Fernandez, Soledad
    Brock, Guy
    BMC BIOINFORMATICS, 2017, 18
  • [2] Differential expression analysis for paired RNA-seq data
    Chung, Lisa M.
    Ferguson, John P.
    Zheng, Wei
    Qian, Feng
    Bruno, Vincent
    Montgomery, Ruth R.
    Zhao, Hongyu
    BMC BIOINFORMATICS, 2013, 14 : 110
  • [3] The impact of amplification on differential expression analyses by RNA-seq
    Swati Parekh
    Christoph Ziegenhain
    Beate Vieth
    Wolfgang Enard
    Ines Hellmann
    Scientific Reports, 6
  • [4] Identifying differential expression for RNA-seq data with no replication
    Gim, Jungsoo
    Park, Taesung
    2012 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS (BIBMW), 2012,
  • [5] On Differential Gene Expression Using RNA-Seq Data
    Lee, Juhee
    Ji, Yuan
    Liang, Shoudan
    Cai, Guoshuai
    Mueller, Peter
    CANCER INFORMATICS, 2011, 10 : 205 - 215
  • [6] Power analysis for RNA-Seq differential expression studies
    Lianbo Yu
    Soledad Fernandez
    Guy Brock
    BMC Bioinformatics, 18
  • [7] From RNA-seq reads to differential expression results
    Alicia Oshlack
    Mark D Robinson
    Matthew D Young
    Genome Biology, 11
  • [8] From RNA-seq reads to differential expression results
    Oshlack, Alicia
    Robinson, Mark D.
    Young, Matthew D.
    GENOME BIOLOGY, 2010, 11 (12):
  • [9] Differential expression analysis for paired RNA-seq data
    Lisa M Chung
    John P Ferguson
    Wei Zheng
    Feng Qian
    Vincent Bruno
    Ruth R Montgomery
    Hongyu Zhao
    BMC Bioinformatics, 14
  • [10] The impact of amplification on differential expression analyses by RNA-seq
    Parekh, Swati
    Ziegenhain, Christoph
    Vieth, Beate
    Enard, Wolfgang
    Hellmann, Ines
    SCIENTIFIC REPORTS, 2016, 6