A STATISTICAL FRAMEWORK FOR TESTING FUNCTIONAL CATEGORIES IN MICROARRAY DATA

Times cited: 49
Authors
Barry, William T. [1]
Nobel, Andrew B. [2]
Wright, Fred A. [3]
Affiliations
[1] Duke Univ, Med Ctr, Dept Biostat & Bioinformat, Durham, NC 27710 USA
[2] Univ N Carolina, Dept Stat & Operat Res, Chapel Hill, NC 27599 USA
[3] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
Keywords
Differential expression; array permutation; bootstrap; Type 1 error; power
DOI
10.1214/07-AOAS146
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Ready access to emerging databases of gene annotation and functional pathways has shifted assessments of differential expression in DNA microarray studies from single genes to groups of genes with shared biological function. This paper takes a critical look at existing methods for assessing the differential expression of a group of genes (functional category), and provides some suggestions for improved performance. We begin by presenting a general framework, in which the set of genes in a functional category is compared to the complementary set of genes on the array. The framework includes tests for overrepresentation of a category within a list of significant genes, and methods that consider continuous measures of differential expression. Existing tests are divided into two classes. Class 1 tests assume gene-specific measures of differential expression are independent, despite overwhelming evidence of positive correlation. Analytic and simulated results are presented that demonstrate Class 1 tests are strongly anti-conservative in practice. Class 2 tests account for gene correlation, typically through array permutation that by construction has proper Type 1 error control for the induced null. However, both Class 1 and Class 2 tests use a null hypothesis that all genes have the same degree of differential expression. We introduce a more sensible and general (Class 3) null under which the profile of differential expression is the same within the category and complement. Under this broader null, Class 2 tests are shown to be conservative. We propose standard bootstrap methods for testing against the Class 3 null and demonstrate they provide valid Type 1 error control and more power than array permutation in simulated datasets and real microarray experiments.
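The array-permutation idea behind Class 2 tests can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-gene statistic (a Welch-type t statistic), the category statistic (mean absolute gene statistic in the category minus that in the complement), and all function and parameter names are assumptions chosen for illustration. Permuting whole arrays keeps each array's gene values together, so the correlation between genes is preserved under the permutation null.

```python
import numpy as np

def gene_stats(expr, labels):
    """Per-gene Welch-type t statistic. expr is genes x arrays; labels is a 0/1 array."""
    a = expr[:, labels == 1]
    b = expr[:, labels == 0]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] + b.var(axis=1, ddof=1) / b.shape[1])
    return (a.mean(axis=1) - b.mean(axis=1)) / se

def category_stat(stats, in_category):
    """Hypothetical category statistic: category mean |t| minus complement mean |t|."""
    return np.abs(stats[in_category]).mean() - np.abs(stats[~in_category]).mean()

def array_permutation_pvalue(expr, labels, in_category, n_perm=1000, seed=0):
    """One-sided p-value obtained by permuting array (sample) labels, not genes."""
    rng = np.random.default_rng(seed)
    observed = category_stat(gene_stats(expr, labels), in_category)
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(labels)  # shuffle which arrays belong to which group
        if category_stat(gene_stats(expr, permuted), in_category) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)
```

A call might look like `array_permutation_pvalue(expr, labels, in_category)`, where `expr` is a genes-by-arrays matrix, `labels` marks the two experimental groups, and `in_category` is a boolean mask over genes. The bootstrap procedure proposed for the Class 3 null is not specified in the abstract and is therefore not sketched here.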
Pages: 286-315
Number of pages: 30