A note on oligonucleotide expression values not being normally distributed

Times cited: 24
Authors
Hardin, Johanna [1]
Wilson, Jason [2]
Affiliations
[1] Pomona Coll, Dept Math, Claremont, CA 91711 USA
[2] Biola Univ, Dept Math, La Mirada, CA 90639 USA
Keywords
Affymetrix; Distributions; Microarray data; Nonnormality
DOI
10.1093/biostatistics/kxp003
Chinese Library Classification number
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
Novel techniques for analyzing microarray data are constantly being developed. Though many of the methods contribute to biological discoveries, inability to properly evaluate the novel techniques limits their ability to advance science. Because the underlying distribution of microarray data is unknown, novel methods are typically tested against the assumed normal distribution. However, microarray data are not, in fact, normally distributed, and assuming so can have misleading consequences. Using an Affymetrix technical replicate spike-in data set, we show that oligonucleotide expression values are not normally distributed for any of the standard methods for calculating expression values. The resulting data tend to have a large proportion of skewed and heavy-tailed genes. Additionally, we show that standard methods can give unexpected and misleading results when the data are not well approximated by the normal distribution. Robust methods are therefore recommended when analyzing microarray data. Additionally, new techniques should be evaluated with skewed and/or heavy-tailed data distributions.
Pages: 446-450
Page count: 5