Large-scale dependent multiple testing via hidden semi-Markov models

Cited by: 0
Authors
Wang, Jiangzhou [1 ]
Wang, Pengfei [2 ]
Affiliations
[1] Shenzhen Univ, Inst Stat Sci, Coll Math & Stat, Shenzhen 518060, Peoples R China
[2] Dongbei Univ Finance & Econ, Sch Stat, Dalian 116025, Peoples R China
Keywords
FDR; Hidden semi-Markov model; Multiple testing; False discovery rate; Empirical Bayes
DOI
10.1007/s00180-023-01367-z
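Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code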
020208 ; 070103 ; 0714 ;
Abstract
Large-scale multiple testing is common in the statistical analysis of high-dimensional data. Conventional multiple testing procedures usually assume, often implicitly, that the tests are independent. However, this assumption rarely holds in practical applications, particularly in "high-throughput" data analysis. Incorporating information about the dependence structure among tests can improve statistical power and the interpretability of discoveries. In this paper, we propose a new large-scale dependent multiple testing procedure based on the hidden semi-Markov model (HSMM), which characterizes local correlations among tests using a semi-Markov process instead of a first-order Markov chain. This approach allows the number of consecutive null hypotheses to follow any reasonable distribution, enabling a more accurate description of complex local correlations. We show that the proposed procedure minimizes the marginal false non-discovery rate (mFNR) at the same marginal false discovery rate (mFDR) level. To reduce the computational complexity of the HSMM, we approximate it with a hidden Markov model (HMM) with an expanded state space. We provide a forward-backward algorithm and an expectation-maximization (EM) algorithm for implementing the proposed procedure. Finally, we demonstrate the superior performance of the SMLIS procedure through extensive simulations and a real data analysis.
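The expanded-state-space idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `expand_two_state_hsmm`, the restriction to two hidden states (null and non-null), and the assumption that sojourn durations have finite support are all hypothetical choices made for this example. Each HMM state is a pair (hidden state, time already spent in it), and the probability of leaving after `d` steps is the hazard rate of the duration distribution.

```python
import numpy as np

def expand_two_state_hsmm(null_dur_pmf, alt_dur_pmf):
    """Transition matrix of an HMM over (state, elapsed duration) pairs.

    Hypothetical helper for illustration. pmf[d-1] is the probability that
    a sojourn in the state lasts exactly d steps; the last entry must be
    positive so that every sub-state is reachable.
    """
    pmfs = [np.asarray(null_dur_pmf, dtype=float),
            np.asarray(alt_dur_pmf, dtype=float)]
    sizes = [len(p) for p in pmfs]
    offsets = [0, sizes[0]]          # index of the first sub-state of each state
    n = sum(sizes)
    A = np.zeros((n, n))
    for s, pmf in enumerate(pmfs):
        if pmf[-1] <= 0:
            raise ValueError("last duration probability must be positive")
        pmf = pmf / pmf.sum()
        survival = np.cumsum(pmf[::-1])[::-1]   # P(duration >= d)
        hazard = pmf / survival                 # P(duration == d | duration >= d)
        other_start = offsets[1 - s]
        for d in range(sizes[s]):
            i = offsets[s] + d
            # The sojourn ends: jump to the first sub-state of the other state.
            A[i, other_start] = hazard[d]
            # The sojourn continues: move to the next duration sub-state.
            if d + 1 < sizes[s]:
                A[i, i + 1] = 1.0 - hazard[d]
    return A

if __name__ == "__main__":
    A = expand_two_state_hsmm([0.2, 0.3, 0.5], [0.6, 0.4])
    print(np.round(A, 3))
```

With finite-support durations this expansion reproduces the semi-Markov sojourn distribution exactly; for unbounded durations the support would have to be truncated, which is where an approximation enters. Standard HMM forward-backward recursions can then be run on the expanded chain, with all sub-states of a hidden state sharing that state's emission distribution.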
Pages: 1093-1126
Page count: 34
Related papers
50 in total
  • [21] OVERLAPPED STATE HIDDEN SEMI-MARKOV MODEL FOR GROUPED MULTIPLE SEQUENCES
    Narimatsu, Hiromi
    Kasai, Hiroyuki
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3397 - 3401
  • [22] Equipment health diagnosis and prognosis using hidden semi-Markov models
    Ming Dong
    David He
    Prashant Banerjee
    Jonathan Keller
    The International Journal of Advanced Manufacturing Technology, 2006, 30 : 738 - 749
  • [23] Large-Scale Multiple Testing of Correlations
    Cai, T. Tony
    Liu, Weidong
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2016, 111 (513) : 229 - 240
  • [24] Extended likelihood approach to large-scale multiple testing
    Lee, Youngjo
    Bjornstad, Jan F.
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2013, 75 (03) : 553 - 575
  • [25] Multiple Testing for Neuroimaging via Hidden Markov Random Field
    Shu, Hai
    Nan, Bin
    Koeppe, Robert
    BIOMETRICS, 2015, 71 (03) : 741 - 750
  • [26] Multiple testing in genome-wide association studies via hidden Markov models
    Wei, Zhi
    Sun, Wenguang
    Wang, Kai
    Hakonarson, Hakon
    BIOINFORMATICS, 2009, 25 (21) : 2802 - 2808
  • [27] Detection of LDoS Attacks Based on Wavelet Energy Entropy and Hidden Semi-Markov Models
    Wu Z.-J.
    Li H.-J.
    Liu L.
    Zhang J.-A.
    Yue M.
    Lei J.
    Ruan Jian Xue Bao/Journal of Software, 2020, 31 (05) : 1549 - 1562
  • [28] MULTIPLE TESTING VIA FDRL FOR LARGE-SCALE IMAGING DATA
    Zhang, Chunming
    Fan, Jianqing
    Yu, Tao
    ANNALS OF STATISTICS, 2011, 39 (01) : 613 - 642
  • [29] Hidden Semi-Markov Models-Based Visual Perceptual State Recognition for Pilots
    Gao, Lina
    Wang, Changyuan
    Wu, Gongpu
    SENSORS, 2023, 23 (14)
  • [30] Hidden Markov model in multiple testing on dependent count data
    Su, Weizhe
    Wang, Xia
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2020, 90 (05) : 889 - 906