Semi-parametric hidden Markov model for large-scale multiple testing under dependency

Cited by: 0
Authors
Kim, Joungyoun [1, 4]
Lim, Johan [2 ]
Lee, Jong Soo [3 ]
Affiliations
[1] Yonsei Univ, Coll Nursing, Mo Im Kim Nursing Res Inst, Seoul, South Korea
[2] Seoul Natl Univ, Dept Stat, Seoul, South Korea
[3] Univ Massachusetts, Dept Math Sci, Lowell, MA USA
[4] Univ Seoul, Dept Artificial Intelligence, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
false discovery rate; local index of significance; modified EM procedure; multiple testing; identifiability; semi-parametric hidden Markov model; FALSE DISCOVERY RATE; STATISTICAL-ANALYSIS; INFERENCE; ALGORITHM; COMPONENT;
DOI
10.1177/1471082X221121235
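Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208; 070103; 0714;
Abstract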
In this article, we propose a new semiparametric hidden Markov model (HMM) for simultaneous hypothesis testing under dependency. Semi- and non-parametric HMMs in the literature require two conditions for model identifiability: (a) the latent Markov chain (MC) is ergodic and its transition probability matrix has full rank, and (b) the observational distributions of different hidden states are disjoint or linearly independent. Unlike the existing models, our semiparametric HMM with two hidden states makes no assumption on the transition probability of the latent MC but assumes that the observational distributions are extremal for the set of all stationary distributions of the model. To estimate the model, we propose a modified expectation-maximization algorithm whose M-step has an additional purification step that makes the observational distributions extremal. We numerically investigate the performance of the proposed procedure in estimating the model and compare it to two recent existing methods in terms of various multiple testing errors. In addition, we apply our procedure to two real data examples: a gas chromatography/mass spectrometry experiment to differentiate the origin of herbal medicine, and the epidemiologic surveillance of an influenza-like illness.
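To make the testing setup concrete, the following is a minimal sketch of how a fitted two-state HMM is typically used for multiple testing. It is not the semiparametric estimator or the modified EM procedure of the article. It assumes the model parameters are already known and, purely for illustration, uses Gaussian observational distributions. The function names (`normal_pdf`, `local_index_of_significance`, `lis_step_up`) and all numeric values are hypothetical. The local index of significance (LIS) listed in the keywords is taken here to be the posterior probability that a hypothesis is null given all observations, computed by a scaled forward-backward pass, and hypotheses are rejected in increasing order of LIS while the running average stays at or below the target level.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Gaussian density; a parametric stand-in for the observational distributions."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def local_index_of_significance(z, trans, init, f0, f1):
    """Return P(state_i = 0 | z_1..z_n) for a two-state HMM (state 0 = null).

    trans[j, k] = P(state_{i+1} = k | state_i = j); init is the initial
    state distribution; f0 and f1 are the null and non-null densities.
    """
    z = np.asarray(z, dtype=float)
    n = len(z)
    emis = np.column_stack([f0(z), f1(z)])  # n x 2 emission densities
    alpha = np.zeros((n, 2))
    beta = np.ones((n, 2))

    # Forward pass, rescaled at each step to avoid numerical underflow.
    a = init * emis[0]
    alpha[0] = a / a.sum()
    for i in range(1, n):
        a = (alpha[i - 1] @ trans) * emis[i]
        alpha[i] = a / a.sum()

    # Backward pass, also rescaled; the scaling cancels in the posterior.
    for i in range(n - 2, -1, -1):
        b = trans @ (emis[i + 1] * beta[i + 1])
        beta[i] = b / b.sum()

    post = alpha * beta
    post /= post.sum(axis=1, keepdims=True)
    return post[:, 0]


def lis_step_up(lis, level):
    """Reject the hypotheses with the smallest LIS values while the
    running mean of the sorted LIS values stays at or below `level`."""
    lis = np.asarray(lis, dtype=float)
    order = np.argsort(lis)
    running_mean = np.cumsum(lis[order]) / np.arange(1, len(lis) + 1)
    passing = np.nonzero(running_mean <= level)[0]
    reject = np.zeros(len(lis), dtype=bool)
    if passing.size:
        reject[order[: passing[-1] + 1]] = True
    return reject


if __name__ == "__main__":
    # Hypothetical parameters: sticky chain, N(0, 1) null, N(2.5, 1) non-null.
    rng = np.random.default_rng(0)
    trans = np.array([[0.95, 0.05], [0.20, 0.80]])
    init = np.array([0.8, 0.2])
    states = np.zeros(500, dtype=int)
    for i in range(1, 500):
        states[i] = rng.choice(2, p=trans[states[i - 1]])
    z = rng.normal(loc=2.5 * states, scale=1.0)

    lis = local_index_of_significance(
        z, trans, init,
        f0=lambda x: normal_pdf(x, 0.0, 1.0),
        f1=lambda x: normal_pdf(x, 2.5, 1.0),
    )
    reject = lis_step_up(lis, level=0.10)
    print("rejections:", int(reject.sum()))
```

In the article's setting the densities `f0` and `f1` would come from the semiparametric fit rather than from fixed Gaussians; the thresholding step only needs the LIS values, so it is unchanged by how the model is estimated.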
Pages: 320-343
Page count: 24