Covariate-modulated large-scale multiple testing under dependence

Cited by: 2
Authors
Wang, Jiangzhou [1]
Cui, Tingting [2]
Zhu, Wensheng [2]
Wang, Pengfei [3]
Affiliations
[1] Shenzhen Univ, Coll Math & Stat, Shenzhen 518060, Peoples R China
[2] Northeast Normal Univ, Sch Math & Stat, Key Lab Appl Stat MOE, Changchun, Peoples R China
[3] Dongbei Univ Finance & Econ, Sch Stat, Dalian 116025, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Covariate-modulated HMM; FDR; Local correlations; Large-scale multiple testing; FALSE DISCOVERY RATE; HIDDEN MARKOV-MODELS; GENOME-WIDE ASSOCIATION; MIXTURES; NUMBER;
DOI
10.1016/j.csda.2022.107664
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Large-scale multiple testing, which involves conducting tens of thousands of hypothesis tests simultaneously, is applied in many scientific fields. Most conventional multiple testing procedures focus on controlling the false discovery rate (FDR) and largely ignore covariate information and the dependence structure among tests. An FDR control procedure, termed the Covariate-Modulated Local Index of Significance (cmLIS) procedure, is proposed; it accounts for local correlations among tests and accommodates covariate information by leveraging a covariate-modulated hidden Markov model (HMM). In the oracle case, where all parameters of the covariate-modulated HMM are known, the cmLIS procedure is shown to be valid and optimal in some sense. Two Bayesian sampling algorithms are provided for parameter estimation, one for the case where the number of mixture components in the nonnull distribution is known and one for the case where it is unknown. Extensive simulations demonstrate the effectiveness of the cmLIS procedure relative to state-of-the-art multiple testing procedures. Finally, the cmLIS procedure is applied to an RNA sequencing data set and a schizophrenia (SCZ) data set. (c) 2022 Elsevier B.V. All rights reserved.
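The abstract does not spell out the rejection rule, so the following is only a minimal sketch of the step-up thresholding commonly used by LIS-type procedures, under the assumption that cmLIS ranks hypotheses in the same way. The function name `lis_step_up` and the input values are hypothetical; each input is assumed to be an already computed posterior probability that the corresponding hypothesis is null (for example, from a forward-backward pass over a fitted HMM, which is not shown here).

```python
def lis_step_up(lis, alpha):
    """Step-up rule on local-index-of-significance values (illustrative).

    lis   : list of posterior null probabilities, one per hypothesis
            (assumed to be computed elsewhere; not the paper's code).
    alpha : target FDR level.

    Returns the sorted indices of the rejected hypotheses: the k
    hypotheses with the smallest values, where k is the largest count
    whose running mean of sorted values does not exceed alpha.
    """
    # Rank hypotheses from most to least significant.
    order = sorted(range(len(lis)), key=lambda i: lis[i])
    running_sum = 0.0
    n_reject = 0
    for k, idx in enumerate(order, start=1):
        running_sum += lis[idx]
        # The running mean estimates the FDR of rejecting the top k.
        if running_sum / k <= alpha:
            n_reject = k
    return sorted(order[:n_reject])


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = [0.01, 0.9, 0.03, 0.5, 0.08]
    print(lis_step_up(example, alpha=0.1))
```

With these hypothetical inputs the three smallest values have a running mean of 0.04, while adding the fourth raises it to 0.155, so the sketch rejects the hypotheses at indices 0, 2 and 4.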
Pages: 15
Related papers
50 in total
  • [31] UGM: a more stable procedure for large-scale multiple testing problems, new solutions to identify oncogene
    Liu, Chengyou
    Zhou, Leilei
    Wang, Yuhe
    Tian, Shuchang
    Zhu, Junlin
    Qin, Hang
    Ding, Yong
    Jiang, Hongbing
    THEORETICAL BIOLOGY AND MEDICAL MODELLING, 2019, 16 (01)
  • [32] Correlation and large-scale simultaneous significance testing
    Efron, Bradley
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2007, 102 (477) : 93 - 103
  • [33] Statistical inference and large-scale multiple testing for high-dimensional regression models
    Cai, T. Tony
    Guo, Zijian
    Xia, Yin
    TEST, 2023, 32 (04) : 1135 - 1171
  • [34] Large-scale dependent multiple testing via hidden semi-Markov models
    Wang, Jiangzhou
    Wang, Pengfei
    COMPUTATIONAL STATISTICS, 2024, 39 : 1093 - 1126
  • [35] Multiple Testing under Dependence via Semiparametric Graphical Models
    Liu, Jie
    Zhang, Chunming
    Burnside, Elizabeth
    Page, David
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 955 - 963
  • [36] Covariate-assisted ranking and screening for large-scale two-sample inference
    Cai, T. Tony
    Sun, Wenguang
    Wang, Weinan
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2019, 81 (02) : 187 - 234
  • [37] Signal Classification in Large-Scale Multi-Sequence Integrative Analysis Under the HMM Dependence
    Li, Wendong
    Xiang, Dongdong
    Chen, Gongtao
    Qiu, Peihua
    TECHNOMETRICS, 2024, 66 (02) : 182 - 195
  • [38] EFFECTS OF STATISTICAL DEPENDENCE ON MULTIPLE TESTING UNDER A HIDDEN MARKOV MODEL
    Chi, Zhiyi
    ANNALS OF STATISTICS, 2011, 39 (01) : 439 - 473
  • [39] Large-Scale Global and Simultaneous Inference: Estimation and Testing in Very High Dimensions
    Cai, T. Tony
    Sun, Wenguang
    ANNUAL REVIEW OF ECONOMICS, VOL 9, 2017, 9 : 411 - 439
  • [40] OPTIMAL RATES OF CONVERGENCE FOR ESTIMATING THE NULL DENSITY AND PROPORTION OF NONNULL EFFECTS IN LARGE-SCALE MULTIPLE TESTING
    Cai, T. Tony
    Jin, Jiashun
    ANNALS OF STATISTICS, 2010, 38 (01) : 100 - 145