Tackling the widespread and critical impact of batch effects in high-throughput data

Cited by: 1323
Authors
Leek, Jeffrey T. [1]
Scharpf, Robert B. [2]
Bravo, Hector Corrada [1,3]
Simcha, David [4]
Langmead, Benjamin [1]
Johnson, W. Evan [5]
Geman, Donald [6]
Baggerly, Keith [7]
Irizarry, Rafael A. [1]
Affiliations
[1] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD 21205 USA
[2] Johns Hopkins Univ, Dept Oncol, Baltimore, MD 21205 USA
[3] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[4] Johns Hopkins Univ, Dept Biomed Engn, Baltimore, MD USA
[5] Brigham Young Univ, Dept Stat, Provo, UT 84602 USA
[6] Johns Hopkins Univ, Dept Appl Math & Stat, Baltimore, MD 21218 USA
[7] Univ Texas MD Anderson Canc Ctr, Dept Bioinformat & Computat Biol, Houston, TX 77230 USA
Funding
US National Institutes of Health;
Keywords
GENE-EXPRESSION; PROTEOMIC PATTERNS; SERUM; NORMALIZATION;
DOI
10.1038/nrg2825
Chinese Library Classification
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
High-throughput technologies are widely used, for example to assay genetic variants, gene and protein expression, and epigenetic modifications. One often overlooked complication with such studies is batch effects, which occur because measurements are affected by laboratory conditions, reagent lots and personnel differences. This becomes a major problem when batch effects are correlated with an outcome of interest and lead to incorrect conclusions. Using both published studies and our own analyses, we argue that batch effects (as well as other technical and biological artefacts) are widespread and critical to address. We review experimental and computational approaches for doing so.
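The confounding problem described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not taken from the paper: the function name `simulate`, the batch shift of 2.0, and the sample sizes are all illustrative assumptions. It generates one feature with no true association with the outcome, adds a constant shift to every sample processed in the second batch, and compares a design where batch coincides with outcome against one where each outcome group is split evenly across batches.

```python
import random
from statistics import mean


def simulate(design, n_per_group=200, batch_shift=2.0, seed=0):
    """Return the case-minus-control mean difference for one null feature.

    The feature has no true association with the outcome; the only
    systematic signal is a constant shift added to batch-1 samples.
    All parameter values are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    values = {"case": [], "control": []}
    for group in ("case", "control"):
        for i in range(n_per_group):
            if design == "confounded":
                # Every case is run in batch 1, every control in batch 0.
                batch = 1 if group == "case" else 0
            elif design == "balanced":
                # Each group is split evenly across the two batches.
                batch = i % 2
            else:
                raise ValueError(f"unknown design: {design}")
            values[group].append(rng.gauss(0.0, 1.0) + batch_shift * batch)
    return mean(values["case"]) - mean(values["control"])


confounded_diff = simulate("confounded")
balanced_diff = simulate("balanced")
print(f"confounded design, apparent group difference: {confounded_diff:.2f}")
print(f"balanced design, apparent group difference:   {balanced_diff:.2f}")
```

In the confounded design the apparent group difference is close to the batch shift, even though the feature is unrelated to the outcome. In the balanced design the shift cancels between groups and the difference stays near zero, which is why spreading outcome groups across batches is a common experimental safeguard.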
Pages: 733-739
Page count: 7