OSAT: a tool for sample-to-batch allocations in genomics experiments

Cited by: 41
Authors
Yan, Li [1]
Ma, Changxing [2]
Wang, Dan [1]
Hu, Qiang [1]
Qin, Maochun [1]
Conroy, Jeffrey M. [3]
Sucheston, Lara E. [3]
Ambrosone, Christine B. [3]
Johnson, Candace S. [3]
Wang, Jianmin [1]
Liu, Song [1]
Affiliations
[1] Roswell Pk Canc Inst, Dept Biostat & Bioinformat, Buffalo, NY 14263 USA
[2] SUNY Buffalo, Dept Biostat, Buffalo, NY 14214 USA
[3] Roswell Pk Canc Inst, Buffalo, NY 14263 USA
Keywords
Genomic Study; Optimization Step; Batch Effect; Randomized Complete Block Design; Alternative Algorithm
DOI
10.1186/1471-2164-13-689
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Batch effects are a source of variability that is not of primary interest but is ubiquitous in sizable genomic experiments. To minimize their impact, an ideal experimental design should distribute biological groups and confounding factors evenly across batches. In practice, however, the final collection of samples in a genomics study may be unbalanced and incomplete, which, without careful sample-to-batch allocation, can lead to severe batch effects. An effective and convenient tool is therefore needed to assign collected samples to batches in a way that minimizes the impact of batch effects. Results: We describe OSAT (Optimal Sample Assignment Tool), a Bioconductor package designed for automated sample-to-batch allocation in genomics experiments. Conclusions: OSAT was developed to facilitate the allocation of collected samples to different batches in genomics studies. By optimizing the even distribution of samples from groups of biological interest across batches, it can reduce confounding or correlation between batches and the biological variables of interest. It can also optimize the homogeneous distribution of confounding factors across batches. It handles ideally balanced designs as well as challenging cases involving incomplete and unbalanced sample collections.
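
The idea of spreading biological groups evenly across batches can be illustrated with a small sketch. The code below is not OSAT's algorithm or API, which this record does not describe; it is a plain stratified round-robin allocation written in Python for illustration only. The function names `allocate` and `batch_table`, the sample identifiers, and the group labels are all hypothetical.

```python
import random
from collections import defaultdict


def allocate(samples, n_batches, seed=0):
    """Assign (sample_id, group) pairs to batches 0..n_batches-1.

    Illustrative stratified round-robin, NOT the OSAT method: samples are
    shuffled within each group, then dealt to batches in turn. The slot
    counter runs across groups so that leftovers from one group do not
    always pile up in batch 0.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample_id, group in samples:
        by_group[group].append(sample_id)

    assignment = {}
    slot = 0
    for group in sorted(by_group):
        ids = by_group[group]
        rng.shuffle(ids)  # randomize which samples of a group go where
        for sample_id in ids:
            assignment[sample_id] = slot % n_batches
            slot += 1
    return assignment


def batch_table(samples, assignment, n_batches):
    """Count samples per group and batch: {group: [count in batch 0, ...]}."""
    table = {}
    for sample_id, group in samples:
        counts = table.setdefault(group, [0] * n_batches)
        counts[assignment[sample_id]] += 1
    return table


# Hypothetical unbalanced collection: 7 cases and 5 controls, 3 batches.
samples = [(f"case{i}", "case") for i in range(7)]
samples += [(f"ctrl{i}", "control") for i in range(5)]
assignment = allocate(samples, n_batches=3)
table = batch_table(samples, assignment, n_batches=3)
for group, counts in table.items():
    print(group, counts)
```

In this sketch each group's per-batch counts differ by at most one and the batch sizes differ by at most one, even though the collection is unbalanced. Balancing several variables at once, including confounding factors, requires an explicit optimization step that this example does not attempt.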
Pages: 7