Comprehensive analysis of correlation coefficients estimated from pooling heterogeneous microarray data

Cited by: 12
Authors
Almeida-de-Macedo, Marcia M. [1]
Ransom, Nick [1]
Feng, Yaping [1]
Hurst, Jonathan [1]
Wurtele, Eve Syrkin [1]
Affiliations
[1] Iowa State Univ, Dept Genet Dev & Cell Biol, Ames, IA 50011 USA
Keywords
GENE-EXPRESSION PROFILES; METAANALYSIS; SIZE; CANCER; MODEL
DOI
10.1186/1471-2105-14-214
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The synthesis of information across microarray studies has been performed by combining statistical results of individual studies (as in a mosaic), or by combining data from multiple studies into a large pool to be analyzed as a single data set (as in a melting pot of data). Specific issues relating to data heterogeneity across microarray studies, such as differences within and between labs or differences among experimental conditions, could lead to equivocal results in a melting pot approach. Results: We applied statistical theory to determine the specific effect of different means and heteroskedasticity across 19 groups of microarray data on the sign and magnitude of gene-to-gene Pearson correlation coefficients obtained from the pool of 19 groups. We quantified the biases of the pooled coefficients and compared them to the biases of correlations estimated by an effect-size model. Mean differences across the 19 groups were the main factor determining the magnitude and sign of the pooled coefficients, which showed largest values of bias as they approached +/-1. Only heteroskedasticity across the pool of 19 groups resulted in less efficient estimations of correlations than did a classical meta-analysis approach of combining correlation coefficients. These results were corroborated by simulation studies involving either mean differences or heteroskedasticity across a pool of N > 2 groups. Conclusions: The combination of statistical results is best suited for synthesizing the correlation between expression profiles of a gene pair across several microarray studies.
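The contrast described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or their effect-size model. It assumes 19 groups of 20 samples each, two genes that are uncorrelated within every group, and group means that shift together from 0 to 18. For the combined estimate it uses a standard fixed-effect Fisher z average with weights n - 3, chosen here only as a familiar stand-in for "combining correlation coefficients". All function names and parameter values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_combined(groups):
    """Combine per-group correlations by a weighted mean of Fisher z values.

    Assumption: fixed-effect weighting with weights n - 3, the inverse of
    the approximate variance of z for a group of size n.
    """
    num = 0.0
    den = 0.0
    for x, y in groups:
        weight = len(x) - 3
        num += weight * math.atanh(pearson(x, y))
        den += weight
    return math.tanh(num / den)


def simulate(n_groups=19, n_per_group=20, seed=0):
    """Generate groups where the two genes are independent within a group
    but share a group-specific mean (hypothetical settings)."""
    rng = random.Random(seed)
    groups = []
    for g in range(n_groups):
        x = [rng.gauss(g, 1.0) for _ in range(n_per_group)]
        y = [rng.gauss(g, 1.0) for _ in range(n_per_group)]
        groups.append((x, y))
    return groups


groups = simulate()

# "Melting pot": concatenate all groups and compute one correlation.
pooled_x = [v for x, _ in groups for v in x]
pooled_y = [v for _, y in groups for v in y]
r_pooled = pearson(pooled_x, pooled_y)

# "Mosaic": compute a correlation per group, then combine the results.
r_combined = fisher_combined(groups)

print(f"pooled r   = {r_pooled:.3f}")
print(f"combined r = {r_combined:.3f}")
```

Under these assumptions the pooled coefficient is driven almost entirely by the between-group mean differences and lands close to +1, while the combined coefficient stays near the within-group value of zero. Reversing the direction of the mean shift for one gene would push the pooled coefficient toward -1 instead.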
Pages: 14