Removing Batch Effects in Analysis of Expression Microarray Data: An Evaluation of Six Batch Adjustment Methods

Cited by: 341
Authors
Chen, Chao [1, 2]
Grennan, Kay [2 ]
Badner, Judith [2 ]
Zhang, Dandan [3 ]
Gershon, Elliot [2 ]
Jin, Li [1 ]
Liu, Chunyu [2 ]
Affiliations
[1] Fudan Univ, Natl Minist Educ, Key Lab Contemporary Anthropol, Shanghai 200433, Peoples R China
[2] Univ Chicago, Dept Psychiat, Chicago, IL 60637 USA
[3] Zhejiang Univ, Dept Pathol, Hangzhou 310003, Zhejiang, Peoples R China
Keywords
GENE-EXPRESSION; GENOME; ARRAY; HYBRIDIZATION; PERFORMANCE; MODEL
DOI
10.1371/journal.pone.0017238
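Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract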
The expression microarray is a frequently used approach to study gene expression on a genome-wide scale. However, the data produced by the thousands of microarray studies published annually are confounded by "batch effects," the systematic error introduced when samples are processed in multiple batches. Although batch effects can be reduced by careful experimental design, they cannot be eliminated unless the whole study is done in a single batch. A number of programs are now available to adjust microarray data for batch effects prior to analysis. We systematically evaluated six of these programs using multiple measures of precision, accuracy and overall performance. ComBat, an Empirical Bayes method, outperformed the other five programs by most metrics. We also showed that it is essential to standardize expression data at the probe level when testing for correlation of expression profiles, due to a sizeable probe effect in microarray data that can inflate the correlation among replicates and unrelated samples.
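The probe-effect point in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure or data: it assumes a simple additive model (a per-probe baseline plus independent noise) with made-up parameters, and the function names `simulate`, `standardize_probes`, `sample_column`, and `pearson` are hypothetical helpers written for this example. It compares the correlation between two unrelated simulated samples before and after each probe is z-scored across samples.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate(n_probes=2000, n_samples=20, seed=0):
    """Assumed model: value = probe baseline + independent noise.

    Samples share nothing except the probe baselines, so any
    correlation between them comes from the probe effect alone.
    """
    rng = random.Random(seed)
    matrix = []
    for _ in range(n_probes):
        probe_effect = rng.gauss(8.0, 2.0)  # assumed baseline spread
        row = [probe_effect + rng.gauss(0.0, 0.5) for _ in range(n_samples)]
        matrix.append(row)
    return matrix


def standardize_probes(matrix):
    """Z-score each probe (row) across samples (columns)."""
    out = []
    for row in matrix:
        mu = statistics.fmean(row)
        sd = statistics.pstdev(row)
        out.append([(v - mu) / sd if sd > 0 else 0.0 for v in row])
    return out


def sample_column(matrix, j):
    """Extract the expression profile of sample j across all probes."""
    return [row[j] for row in matrix]


matrix = simulate()
raw_r = pearson(sample_column(matrix, 0), sample_column(matrix, 1))

standardized = standardize_probes(matrix)
std_r = pearson(sample_column(standardized, 0), sample_column(standardized, 1))

print(f"correlation before probe standardization: {raw_r:.3f}")
print(f"correlation after probe standardization:  {std_r:.3f}")
```

Under these assumptions the two samples are unrelated by construction, so a high raw correlation reflects only the shared probe baselines, and it should fall to near zero once each probe is standardized.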
Pages: 10