A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis

Cited by: 842
Authors
Dillies, Marie-Agnes [1]
Rau, Andrea [1]
Aubert, Julie [1]
Hennequet-Antier, Christelle [1]
Jeanmougin, Marine [1]
Servant, Nicolas [1]
Keime, Celine [1]
Marot, Guillemette [1]
Castel, David [1]
Estelle, Jordi [1]
Guernec, Gregory [1]
Jagla, Bernd [1]
Jouneau, Luc [1]
Laloe, Denis [1]
Le Gall, Caroline [1]
Schaeffer, Brigitte [1]
Le Crom, Stephane [1]
Guedj, Mickael [1]
Jaffrezic, Florence [1]
Affiliations
[1] Inst Pasteur, F-75724 Paris 15, France
Keywords
high-throughput sequencing; RNA-seq; normalization; differential analysis; GENE-EXPRESSION; SEQ DATA; TRANSCRIPTOME; QUANTIFICATION; STRATEGY; REVEALS; MOUSE; ARRAY
DOI
10.1093/bib/bbs046
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
During the last 3 years, a number of approaches for the normalization of RNA sequencing data have emerged in the literature, differing both in the type of bias adjustment and in the statistical strategy adopted. However, as data continue to accumulate, there has been no clear consensus on the appropriate normalization method to be used or the impact of a chosen method on the downstream analysis. In this work, we focus on a comprehensive comparison of seven recently proposed normalization methods for the differential analysis of RNA-seq data, with an emphasis on the use of varied real and simulated datasets involving different species and experimental designs to represent data characteristics commonly observed in practice. Based on this comparison study, we propose practical recommendations on the appropriate normalization method to be used and its impact on the differential analysis of RNA-seq data.
Pages: 671-683
Number of pages: 13
Related papers
46 in total
  • [1] Differential expression analysis for sequence count data
    Anders, Simon
    Huber, Wolfgang
    [J]. GENOME BIOLOGY, 2010, 11 (10):
  • [2] Detecting differential usage of exons from RNA-seq data
    Anders, Simon
    Reyes, Alejandro
    Huber, Wolfgang
    [J]. GENOME RESEARCH, 2012, 22 (10) : 2008 - 2017
  • [3] A Two-Stage Poisson Model for Testing RNA-Seq Data
    Auer, Paul L.
    Doerge, Rebecca W.
    [J]. STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2011, 10 (01)
  • [4] CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING
    BENJAMINI, Y
    HOCHBERG, Y
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) : 289 - 300
  • [5] A comparison of normalization methods for high density oligonucleotide array data based on variance and bias
    Bolstad, BM
    Irizarry, RA
    Åstrand, M
    Speed, TP
    [J]. BIOINFORMATICS, 2003, 19 (02) : 185 - 193
  • [6] Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments
    Bullard, James H.
    Purdom, Elizabeth
    Hansen, Kasper D.
    Dudoit, Sandrine
    [J]. BMC BIOINFORMATICS, 2010, 11
  • [7] Calza S, 2010, METHODS MOL BIOL, V673, P37, DOI 10.1007/978-1-60761-842-3_3
  • [8] Human housekeeping genes are compact
    Eisenberg, E
    Levanon, EY
    [J]. TRENDS IN GENETICS, 2003, 19 (07) : 362 - 365
  • [9] Full-length transcriptome assembly from RNA-Seq data without a reference genome
    Grabherr, Manfred G.
    Haas, Brian J.
    Yassour, Moran
    Levin, Joshua Z.
    Thompson, Dawn A.
    Amit, Ido
    Adiconis, Xian
    Fan, Lin
    Raychowdhury, Raktima
    Zeng, Qiandong
    Chen, Zehua
    Mauceli, Evan
    Hacohen, Nir
    Gnirke, Andreas
    Rhind, Nicholas
    di Palma, Federica
    Birren, Bruce W.
    Nusbaum, Chad
    Lindblad-Toh, Kerstin
    Friedman, Nir
    Regev, Aviv
    [J]. NATURE BIOTECHNOLOGY, 2011, 29 (07) : 644 - U130
  • [10] Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs
    Guttman, Mitchell
    Garber, Manuel
    Levin, Joshua Z.
    Donaghey, Julie
    Robinson, James
    Adiconis, Xian
    Fan, Lin
    Koziol, Magdalena J.
    Gnirke, Andreas
    Nusbaum, Chad
    Rinn, John L.
    Lander, Eric S.
    Regev, Aviv
    [J]. NATURE BIOTECHNOLOGY, 2010, 28 (05) : 503 - U166