SINGLE-INDEX MODULATED MULTIPLE TESTING

Cited by: 16

Authors:

Du, Lilun [1]

Zhang, Chunming [2]

Affiliations:

[1] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA

[2] Nankai Univ, Sch Math Sci, Tianjin 300071, Peoples R China

Source:

ANNALS OF STATISTICS | 2014 / Volume 42 / Issue 04

Funding:

National Science Foundation (USA);

Keywords:

Bivariate normality; local false discovery rate; multiple comparison; p-value; simultaneous inference; symmetry property; FALSE DISCOVERY RATE; DNA COPY NUMBER; INTEGRATIVE ANALYSIS; PROSTATE-CANCER; MICROARRAY DATA; P-VALUES; RATES; POWER; EXPRESSION; GENES;

DOI:

10.1214/14-AOS1222

Chinese Library Classification:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Discipline classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

In the context of large-scale multiple testing, hypotheses are often accompanied by certain prior information. In this paper, we present a single-index modulated (SIM) multiple testing procedure, which maintains control of the false discovery rate while incorporating prior information, by assuming the availability of a bivariate p-value, (p(1), p(2)), for each hypothesis, where p(1) is a preliminary p-value from prior information and p(2) is the primary p-value for the ultimate analysis. To find the optimal rejection region for the bivariate p-value, we propose a criterion based on the ratio of probability density functions of (p(1), p(2)) under the true null and nonnull. This criterion in the bivariate normal setting further motivates us to project the bivariate p-value to a single index, p(theta), for a wide range of directions theta. The true null distribution of p(theta) is estimated via parametric and nonparametric approaches, leading to two procedures for estimating and controlling the false discovery rate. To derive the optimal projection direction theta, we propose a new approach based on power comparison, which is further shown to be consistent under some mild conditions. Simulation evaluations indicate that the SIM multiple testing procedure improves the detection power significantly while controlling the false discovery rate. An analysis of a real dataset is also presented.
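The single-index idea in the abstract can be illustrated with a minimal sketch. It assumes the projection takes the form p(theta) = Phi(cos(theta) * Phi^{-1}(p(1)) + sin(theta) * Phi^{-1}(p(2))), where Phi is the standard normal distribution function, and it assumes p(theta) is uniform under the true null (which holds when the two null z-scores are independent standard normals), so the Benjamini-Hochberg step-up rule is applied directly. The procedures described in the abstract instead estimate the null distribution of p(theta) and select theta by power comparison; neither step is reproduced here. The function names `project_p_value` and `benjamini_hochberg` are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def project_p_value(p1, p2, theta):
    """Project a bivariate p-value onto direction theta (assumed form)."""
    eps = 1e-12  # keep p-values strictly inside (0, 1) for inv_cdf
    z1 = _STD_NORMAL.inv_cdf(min(max(p1, eps), 1.0 - eps))
    z2 = _STD_NORMAL.inv_cdf(min(max(p2, eps), 1.0 - eps))
    return _STD_NORMAL.cdf(math.cos(theta) * z1 + math.sin(theta) * z2)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices rejected by the BH step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value is below its BH threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


def sim_reject(pairs, theta, alpha=0.05):
    """Project each (p1, p2) pair, then apply BH to the single index.

    Assumes p(theta) is uniform under the null; this is an
    illustrative simplification, not the paper's estimated null.
    """
    projected = [project_p_value(p1, p2, theta) for p1, p2 in pairs]
    return benjamini_hochberg(projected, alpha)
```

Under this assumed form, theta = 0 recovers the preliminary p-value p(1) and theta = pi/2 recovers the primary p-value p(2); intermediate directions blend the two sources of evidence on the z-score scale.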


Pages: 1262-1311

Page count: 50

References (36 in total):

[1] GOing Bayesian: model-based gene set analysis of genome-scale data [J].