Heteroscedasticity-Adjusted Ranking and Thresholding for Large-Scale Multiple Testing

Cited by: 2
Authors
Fu, Luella [1 ]
Gang, Bowen [2 ]
James, Gareth M. [3 ]
Sun, Wenguang [3 ]
Affiliations
[1] San Francisco State Univ, Dept Math, San Francisco, CA 94132 USA
[2] Fudan Univ, Dept Stat, Shanghai, Peoples R China
[3] Univ Southern Calif, Dept Data Sci & Operat, Los Angeles, CA 90089 USA
Keywords
Covariate-assisted inference; Data processing and information loss; False discovery rate; Heteroscedasticity; Multiple testing with side information; Structured multiple testing; FALSE-DISCOVERY RATE; GENE-EXPRESSION; EMPIRICAL BAYES; POWER; HYPOTHESES; NULL; MICROARRAYS;
DOI
10.1080/01621459.2020.1840992
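Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes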
020208 ; 070103 ; 0714 ;
Abstract
Standardization has been a widely adopted practice in multiple testing, for it takes into account the variability in sampling and makes the test statistics comparable across different study units. However, despite conventional wisdom to the contrary, we show that there can be a significant loss in information from basing hypothesis tests on standardized statistics rather than the full data. We develop a new class of heteroscedasticity-adjusted ranking and thresholding (HART) rules that aim to improve existing methods by simultaneously exploiting commonalities and adjusting heterogeneities among the study units. The main idea of HART is to bypass standardization by directly incorporating both the summary statistic and its variance into the testing procedure. A key message is that the variance structure of the alternative distribution, which is subsumed under standardized statistics, is highly informative and can be exploited to achieve higher power. The proposed HART procedure is shown to be asymptotically valid and optimal for false discovery rate (FDR) control. Our simulation results demonstrate that HART achieves substantial power gain over existing methods at the same FDR level. We illustrate the implementation through a microarray analysis of myeloma.
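The contrast between ranking by standardized statistics and ranking by the pair (statistic, variance) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simple two-group model with known parameters (a null N(0, sigma^2), a point alternative N(mu_alt, sigma^2), and a known non-null proportion pi_alt), and the function names `null_posterior` and `rank_and_threshold` are hypothetical. The thresholding step rejects the hypotheses with the smallest null posterior probabilities for as long as their running average stays at or below the target FDR level alpha.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def null_posterior(x, sigma, pi_alt, mu_alt):
    """P(null | x, sigma) under the assumed two-group model.

    Assumption for illustration only: the null is N(0, sigma^2) and the
    alternative is N(mu_alt, sigma^2), with known pi_alt and mu_alt.
    """
    f0 = (1.0 - pi_alt) * normal_pdf(x, 0.0, sigma)
    f1 = pi_alt * normal_pdf(x, mu_alt, sigma)
    return f0 / (f0 + f1)


def rank_and_threshold(scores, alpha):
    """Return indices rejected by a running-average step-wise rule.

    Scores are sorted in ascending order; the k smallest are rejected,
    where k is the largest count whose running mean is <= alpha.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    running, k = 0.0, 0
    for j, i in enumerate(order, start=1):
        running += scores[i]
        if running / j <= alpha:
            k = j
    return sorted(order[:k])


# Two study units with identical z-values (x / sigma = 2.5) but
# different standard deviations.
observations = [(2.5, 1.0), (5.0, 2.0)]
posteriors = [null_posterior(x, s, pi_alt=0.1, mu_alt=2.0) for x, s in observations]
```

In this toy setting the two units are indistinguishable after standardization, yet their null posterior probabilities differ, because the alternative mean sits at a different distance from each observation once measured in units of sigma. That is the kind of information a rule based on z-values alone cannot use.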
Pages: 1028-1040
Page count: 13
Related papers
50 items in total
  • [31] Large-scale dependent multiple testing via higher-order hidden Markov models
    Li, Canhui
    Wang, Jiangzhou
    Wang, Pengfei
    JOURNAL OF BIOPHARMACEUTICAL STATISTICS, 2024,
  • [32] Semi-parametric hidden Markov model for large-scale multiple testing under dependency
    Kim, Joungyoun
    Lim, Johan
    Lee, Jong Soo
    STATISTICAL MODELLING, 2024, 24 (04) : 320 - 343
  • [33] LARGE-SCALE MULTIPLE INFERENCE OF COLLECTIVE DEPENDENCE WITH APPLICATIONS TO PROTEIN FUNCTION
    Jernigan, Robert
    Jia, Kejue
    Ren, Zhao
    Zhou, Wen
    ANNALS OF APPLIED STATISTICS, 2021, 15 (02) : 902 - 924
  • [34] Directional false discovery rate control in large-scale multiple comparisons
    Liang, Wenjuan
    Xiang, Dongdong
    Mei, Yajun
    Li, Wendong
    JOURNAL OF APPLIED STATISTICS, 2024, 51 (15) : 3195 - 3214
  • [35] ACCOUNTING FOR TIME DEPENDENCE IN LARGE-SCALE MULTIPLE TESTING OF EVENT-RELATED POTENTIAL DATA
    Sheu, Ching-Fan
    Perthame, Emeline
    Lee, Yuh-Shiow
    Causeur, David
    ANNALS OF APPLIED STATISTICS, 2016, 10 (01) : 219 - 245
  • [36] Large-Scale Global and Simultaneous Inference: Estimation and Testing in Very High Dimensions
    Cai, T. Tony
    Sun, Wenguang
    ANNUAL REVIEW OF ECONOMICS, VOL 9, 2017, 9 : 411 - 439
  • [37] Dimension constraints improve hypothesis testing for large-scale, graph-associated, brain-image data
    Vo, Tien
    Mishra, Akshay
    Ithapu, Vamsi
    Singh, Vikas
    Newton, Michael A.
    BIOSTATISTICS, 2022, 23 (03) : 860 - 874
  • [38] Large-scale multiple hypothesis testing with the normal-beta prime prior
    Bai, Ray
    Ghosh, Malay
    STATISTICS, 2019, 53 (06) : 1210 - 1233
  • [39] A computational framework for complex disease stratification from multiple large-scale datasets
    De Meulder, Bertrand
    Lefaudeux, Diane
    Bansal, Aruna T.
    Mazein, Alexander
    Chaiboonchoe, Amphun
    Ahmed, Hassan
    Balaur, Irina
    Saqi, Mansoor
    Pellet, Johann
    Ballereau, Stephane
    Lemonnier, Nathanael
    Sun, Kai
    Pandis, Ioannis
    Yang, Xian
    Batuwitage, Manohara
    Kretsos, Kosmas
    van Eyll, Jonathan
    Bedding, Alun
    Davison, Timothy
    Dodson, Paul
    Larminie, Christopher
    Postle, Anthony
    Corfield, Julie
    Djukanovic, Ratko
    Chung, Kian Fan
    Adcock, Ian M.
    Guo, Yi-Ke
    Sterk, Peter J.
    Manta, Alexander
    Rowe, Anthony
    Baribaud, Frederic
    Auffray, Charles
    BMC SYSTEMS BIOLOGY, 2018, 12
  • [40] Fast and covariate-adaptive method amplifies detection power in large-scale multiple hypothesis testing
    Zhang, Martin J.
    Xia, Fei
    Zou, James
    NATURE COMMUNICATIONS, 2019, 10 (1)