Covariate-modulated large-scale multiple testing under dependence

Cited by: 2
Authors
Wang, Jiangzhou [1]
Cui, Tingting [2]
Zhu, Wensheng [2]
Wang, Pengfei [3]
Affiliations
[1] Shenzhen Univ, Coll Math & Stat, Shenzhen 518060, Peoples R China
[2] Northeast Normal Univ, Sch Math & Stat, Key Lab Appl Stat MOE, Changchun, Peoples R China
[3] Dongbei Univ Finance & Econ, Sch Stat, Dalian 116025, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Covariate-modulated HMM; FDR; Local correlations; Large-scale multiple testing; FALSE DISCOVERY RATE; HIDDEN MARKOV-MODELS; GENOME-WIDE ASSOCIATION; MIXTURES; NUMBER;
DOI
10.1016/j.csda.2022.107664
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Large-scale multiple testing, which involves conducting tens of thousands of hypothesis tests simultaneously, has been applied in many scientific fields. Most conventional multiple testing procedures focus on controlling the false discovery rate (FDR) and largely ignore covariate information and the dependence structure among tests. An FDR control procedure, termed the Covariate-Modulated Local Index of Significance (cmLIS) procedure, is proposed; it takes into account local correlations among tests and also accommodates covariate information by leveraging a covariate-modulated hidden Markov model (HMM). In the oracle case where all parameters of the covariate-modulated HMM are known, the cmLIS procedure is shown to be valid and optimal in some sense. Two Bayesian sampling algorithms are provided for parameter estimation, one for the case where the number of mixture components in the nonnull distribution is known and one for the case where it is unknown. Extensive simulations are conducted to demonstrate the effectiveness of the cmLIS procedure relative to state-of-the-art multiple testing procedures. Finally, the cmLIS procedure is applied to an RNA sequencing dataset and a schizophrenia (SCZ) dataset. (c) 2022 Elsevier B.V. All rights reserved.
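As a rough illustration of the kind of procedure the abstract describes, the sketch below computes posterior null probabilities under a hidden Markov model and then applies a step-up rule to them. It is not the authors' cmLIS implementation. It assumes a plain two-state HMM with a fixed transition matrix (the covariate modulation is omitted), a N(0, 1) null, a single normal nonnull component, and known parameters (the oracle case). The function names `posterior_null_prob` and `lis_step_up` and all parameter values are hypothetical.

```python
import numpy as np


def normal_pdf(z, mean, sd):
    """Density of N(mean, sd^2) evaluated at z."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def posterior_null_prob(z, trans, init, alt_mean, alt_sd):
    """P(state_i = null | all z) for a two-state HMM (assumed model, no covariates).

    trans[j, k] = P(state_{i+1} = k | state_i = j); state 0 is null, state 1 nonnull.
    Uses the scaled forward-backward recursions to avoid numerical underflow.
    """
    z = np.asarray(z, dtype=float)
    trans = np.asarray(trans, dtype=float)
    init = np.asarray(init, dtype=float)
    m = len(z)
    # Column 0: null density N(0, 1); column 1: assumed nonnull density.
    emis = np.column_stack([normal_pdf(z, 0.0, 1.0), normal_pdf(z, alt_mean, alt_sd)])
    fwd = np.zeros((m, 2))
    bwd = np.ones((m, 2))
    scale = np.zeros(m)
    fwd[0] = init * emis[0]
    scale[0] = fwd[0].sum()
    fwd[0] /= scale[0]
    for i in range(1, m):
        fwd[i] = (fwd[i - 1] @ trans) * emis[i]
        scale[i] = fwd[i].sum()
        fwd[i] /= scale[i]
    for i in range(m - 2, -1, -1):
        bwd[i] = trans @ (emis[i + 1] * bwd[i + 1]) / scale[i + 1]
    post = fwd * bwd
    post /= post.sum(axis=1, keepdims=True)
    return post[:, 0]


def lis_step_up(lis, alpha):
    """Reject the k smallest values, with k the largest count whose running mean <= alpha."""
    lis = np.asarray(lis, dtype=float)
    order = np.argsort(lis)
    running_mean = np.cumsum(lis[order]) / np.arange(1, len(lis) + 1)
    passing = np.nonzero(running_mean <= alpha)[0]
    reject = np.zeros(len(lis), dtype=bool)
    if passing.size > 0:
        reject[order[: passing[-1] + 1]] = True
    return reject


# Toy usage with made-up z-values and parameters.
z_values = [0.1, -0.3, 3.5, 4.0, 0.2]
lis_values = posterior_null_prob(
    z_values, trans=[[0.9, 0.1], [0.2, 0.8]], init=[0.8, 0.2], alt_mean=3.0, alt_sd=1.0
)
rejections = lis_step_up(lis_values, alpha=0.05)
```

Because the posterior null probability at each position depends on all observations through the forward and backward passes, neighbouring tests inform one another, which is one way local correlation can enter the ranking.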
Pages: 15
Related papers
50 items in total
  • [41] A non-randomized procedure for large-scale heterogeneous multiple discrete testing based on randomized tests
    Dai, Xiaoyu
    Lin, Nan
    Li, Daofeng
    Wang, Ting
    BIOMETRICS, 2019, 75 (02) : 638 - 649
  • [42] Large-Scale Simultaneous Testing Using Kernel Density Estimation
    Santu Ghosh
    Alan M. Polansky
    Sankhya A, 2022, 84 (2) : 808 - 843
  • [43] Large-Scale Simultaneous Testing Using Kernel Density Estimation
    Ghosh, Santu
    Polansky, Alan M.
    SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY, 2022, 84 (02) : 808 - 843
  • [44] Mixing in forced stratified turbulence and its dependence on large-scale forcing
    Howland, Christopher J.
    Taylor, John R.
    Caulfield, C. P.
    JOURNAL OF FLUID MECHANICS, 2020, 898 (898)
  • [45] Signal classification for the integrative analysis of multiple sequences of large-scale multiple tests
    Xiang, Dongdong
    Zhao, Sihai Dave
    Cai, T. Tony
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2019, 81 (04) : 707 - 734
  • [46] r-Power for Multiple Hypotheses Testing under Dependence
    Chakraborty, Swarnita
    Sijuwade, Adebowale
    Dasgupta, Nairanjana
    STATISTICS AND APPLICATIONS, 2024, 22 (03) : 429 - 448
  • [47] ASYMPTOTIC OPTIMALITY OF THE WESTFALL-YOUNG PERMUTATION PROCEDURE FOR MULTIPLE TESTING UNDER DEPENDENCE
    Meinshausen, Nicolai
    Maathuis, Marloes H.
    Buehlmann, Peter
    ANNALS OF STATISTICS, 2011, 39 (06) : 3369 - 3391
  • [48] MixTwice: large-scale hypothesis testing for peptide arrays by variance mixing
    Zheng, Zihao
    Mergaert, Aisha M.
    Ong, Irene M.
    Shelef, Miriam A.
    Newton, Michael A.
    BIOINFORMATICS, 2021, 37 (17) : 2637 - 2643
  • [49] Large-Scale Triaxial Testing of TDA Mixed with Fine and Coarse Aggregates
    El Naggar, Hany
    Ashari, Mohammad
    BUILDINGS, 2023, 13 (01)
  • [50] STUDY DESIGNS Statistical power and significance testing in large-scale genetic studies
    Sham, Pak C.
    Purcell, Shaun M.
    NATURE REVIEWS GENETICS, 2014, 15 (05) : 335 - 346