Large-scale multiple testing via multivariate hidden Markov models

Citations: 0
Authors
Hou, Zhiqiang [1]
Wang, Pengfei [2]
Affiliations
[1] Shandong Univ Finance & Econ, Sch Stat, Jinan, Peoples R China
[2] Dongbei Univ Finance & Econ, Sch Stat, Dalian, Peoples R China
Keywords
False discovery rate; Multivariate hidden Markov models; Multiple testing; Genome-wide association; Empirical Bayes; Schizophrenia
DOI
10.1080/03610918.2022.2061001
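Chinese Library Classification (CLC)
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract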
Large-scale multiple testing with correlated tests and auxiliary statistics arises in a wide range of scientific fields. Conventional multiple testing procedures largely ignore auxiliary information, such as sparsity information, and the dependence structure among tests, which may result in a loss of testing efficiency. In this paper, we propose a procedure, called the multivariate local index of significance (mvLIS) procedure, for large-scale multiple testing. The mvLIS procedure not only characterizes local correlations among tests via a Markov chain but also incorporates auxiliary information via multivariate statistics. We show that the oracle mvLIS procedure is valid, namely, that it controls the false discovery rate (FDR) at the pre-specified level, and that it yields the smallest false non-discovery rate (FNR) at the same FDR level. A data-driven mvLIS procedure is then developed to mimic the oracle procedure. Comprehensive simulation studies and a real data analysis of schizophrenia (SCZ) data are performed to illustrate the superior performance of the mvLIS procedure. Moreover, as a byproduct of independent interest, we generalize the single-index modulated (SIM) multiple testing procedure, which embeds prior information via 2-dimensional p-values, to allow for d-dimensional (d >= 3) statistics in multiple testing. The detailed extension is deferred to the Discussion.
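The abstract does not spell out the rejection rule. As a rough illustration only, the sketch below shows the step-up thresholding commonly used with local-index-of-significance statistics in the HMM multiple testing literature: sort the LIS values in ascending order and reject the largest set whose running average stays at or below the target FDR level. Whether the mvLIS procedure uses exactly this rule is an assumption; the function name `lis_step_up` and the example values are hypothetical, and each LIS value is assumed to be a posterior probability that the corresponding hypothesis is null.

```python
from typing import List


def lis_step_up(lis: List[float], alpha: float) -> List[bool]:
    """Hypothetical sketch of a LIS-style step-up rule.

    lis[i] is assumed to be the posterior probability that hypothesis i
    is a true null. Returns one rejection decision per hypothesis.
    """
    # Indices ordered from the smallest (most significant) LIS value up.
    order = sorted(range(len(lis)), key=lambda i: lis[i])

    running_sum = 0.0
    k = 0  # number of hypotheses to reject
    for rank, idx in enumerate(order, start=1):
        running_sum += lis[idx]
        # The running mean of the smallest `rank` LIS values estimates
        # the FDR incurred by rejecting those `rank` hypotheses.
        if running_sum / rank <= alpha:
            k = rank

    reject = [False] * len(lis)
    for idx in order[:k]:
        reject[idx] = True
    return reject


if __name__ == "__main__":
    example_lis = [0.01, 0.9, 0.05, 0.3, 0.02]  # made-up values
    print(lis_step_up(example_lis, alpha=0.1))
```

In a data-driven version, the LIS values would be replaced by estimates computed from a fitted model; that estimation step is not sketched here.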
Pages: 1932-1951
Number of pages: 20