Heteroscedasticity-Adjusted Ranking and Thresholding for Large-Scale Multiple Testing

Cited by: 2

Authors:

Fu, Luella [1]

Gang, Bowen [2]

James, Gareth M. [3]

Sun, Wenguang [3]

Affiliations:

[1] San Francisco State Univ, Dept Math, San Francisco, CA 94132 USA

[2] Fudan Univ, Dept Stat, Shanghai, Peoples R China

[3] Univ Southern Calif, Dept Data Sci & Operat, Los Angeles, CA 90089 USA

Source:

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION | 2022 / Vol. 117 / Issue 538

Keywords:

Covariate-assisted inference; Data processing and information loss; False discovery rate; Heteroscedasticity; Multiple testing with side information; Structured multiple testing; FALSE-DISCOVERY RATE; GENE-EXPRESSION; EMPIRICAL BAYES; POWER; HYPOTHESES; NULL; MICROARRAYS;

DOI:

10.1080/01621459.2020.1840992

Chinese Library Classification number:

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Discipline classification code:

020208 ; 070103 ; 0714 ;

Abstract:

Standardization has been a widely adopted practice in multiple testing, for it takes into account the variability in sampling and makes the test statistics comparable across different study units. However, despite conventional wisdom to the contrary, we show that there can be a significant loss in information from basing hypothesis tests on standardized statistics rather than the full data. We develop a new class of heteroscedasticity-adjusted ranking and thresholding (HART) rules that aim to improve existing methods by simultaneously exploiting commonalities and adjusting heterogeneities among the study units. The main idea of HART is to bypass standardization by directly incorporating both the summary statistic and its variance into the testing procedure. A key message is that the variance structure of the alternative distribution, which is subsumed under standardized statistics, is highly informative and can be exploited to achieve higher power. The proposed HART procedure is shown to be asymptotically valid and optimal for false discovery rate (FDR) control. Our simulation results demonstrate that HART achieves substantial power gain over existing methods at the same FDR level. We illustrate the implementation through a microarray analysis of myeloma.
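The abstract's central claim, that a standardized statistic discards information carried by the variance, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the authors' HART estimator: a two-group model with known parameters (`pi_alt`, `mu_alt`, `tau` are hypothetical), an oracle posterior null probability computed from the pair (x, sigma), and a generic step-up rule that rejects the hypotheses with the smallest scores while their running average stays at or below the target FDR level.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def null_posterior(x, sigma, pi_alt=0.1, mu_alt=2.0, tau=1.0):
    """P(null | x, sigma) under an assumed two-group model.

    Null:        x ~ N(0, sigma^2)
    Alternative: effect ~ N(mu_alt, tau^2), so x ~ N(mu_alt, tau^2 + sigma^2)
    All parameters are hypothetical and treated as known.
    """
    f0 = normal_pdf(x, 0.0, sigma)
    f1 = normal_pdf(x, mu_alt, math.sqrt(tau ** 2 + sigma ** 2))
    return (1.0 - pi_alt) * f0 / ((1.0 - pi_alt) * f0 + pi_alt * f1)


def rank_and_threshold(scores, alpha=0.1):
    """Generic step-up rule on posterior null probabilities.

    Sort scores in ascending order and reject the largest prefix whose
    running average does not exceed alpha. Returns rejected indices.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    running_sum = 0.0
    k = 0
    for j, i in enumerate(order, start=1):
        running_sum += scores[i]
        if running_sum / j <= alpha:
            k = j
    return sorted(order[:k])


# Two study units with the same standardized statistic z = x / sigma = 3
# but different variances receive different posterior null probabilities.
score_small_sigma = null_posterior(x=1.5, sigma=0.5)
score_large_sigma = null_posterior(x=6.0, sigma=2.0)
print(score_small_sigma, score_large_sigma)

# Thresholding a toy vector of scores at a nominal level of 0.1.
print(rank_and_threshold([0.01, 0.5, 0.05, 0.9], alpha=0.1))
```

Under these assumed parameters the two units are indistinguishable to any rule based on z alone, yet they are ranked differently once sigma enters the score, which is the kind of information the abstract says standardization subsumes.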

Pages: 1028-1040

Page count: 13

50 items in total

[31] Large-scale dependent multiple testing via higher-order hidden Markov models
Li, Canhui
Wang, Jiangzhou
Wang, Pengfei
JOURNAL OF BIOPHARMACEUTICAL STATISTICS, 2024,
[32] Semi-parametric hidden Markov model for large-scale multiple testing under dependency
Kim, Joungyoun
Lim, Johan
Lee, Jong Soo
STATISTICAL MODELLING, 2024, 24 (04) : 320 - 343
[33] LARGE-SCALE MULTIPLE INFERENCE OF COLLECTIVE DEPENDENCE WITH APPLICATIONS TO PROTEIN FUNCTION
Jernigan, Robert
Jia, Kejue
Ren, Zhao
Zhou, Wen
ANNALS OF APPLIED STATISTICS, 2021, 15 (02) : 902 - 924
[34] Directional false discovery rate control in large-scale multiple comparisons
Liang, Wenjuan
Xiang, Dongdong
Mei, Yajun
Li, Wendong
JOURNAL OF APPLIED STATISTICS, 2024, 51 (15) : 3195 - 3214
[35] ACCOUNTING FOR TIME DEPENDENCE IN LARGE-SCALE MULTIPLE TESTING OF EVENT-RELATED POTENTIAL DATA
Sheu, Ching-Fan
Perthame, Emeline
Lee, Yuh-Shiow
Causeur, David
ANNALS OF APPLIED STATISTICS, 2016, 10 (01) : 219 - 245
[36] Large-Scale Global and Simultaneous Inference: Estimation and Testing in Very High Dimensions
Cai, T. Tony
Sun, Wenguang
ANNUAL REVIEW OF ECONOMICS, VOL 9, 2017, 9 : 411 - 439
[37] Dimension constraints improve hypothesis testing for large-scale, graph-associated, brain-image data
Vo, Tien
Mishra, Akshay
Ithapu, Vamsi
Singh, Vikas
Newton, Michael A.
BIOSTATISTICS, 2022, 23 (03) : 860 - 874
[38] Large-scale multiple hypothesis testing with the normal-beta prime prior
Bai, Ray
Ghosh, Malay
STATISTICS, 2019, 53 (06) : 1210 - 1233
[39] A computational framework for complex disease stratification from multiple large-scale datasets
De Meulder, Bertrand
Lefaudeux, Diane
Bansal, Aruna T.
Mazein, Alexander
Chaiboonchoe, Amphun
Ahmed, Hassan
Balaur, Irina
Saqi, Mansoor
Pellet, Johann
Ballereau, Stephane
Lemonnier, Nathanael
Sun, Kai
Pandis, Ioannis
Yang, Xian
Batuwitage, Manohara
Kretsos, Kosmas
van Eyll, Jonathan
Bedding, Alun
Davison, Timothy
Dodson, Paul
Larminie, Christopher
Postle, Anthony
Corfield, Julie
Djukanovic, Ratko
Chung, Kian Fan
Adcock, Ian M.
Guo, Yi-Ke
Sterk, Peter J.
Manta, Alexander
Rowe, Anthony
Baribaud, Frederic
Auffray, Charles
BMC SYSTEMS BIOLOGY, 2018, 12
[40] Fast and covariate-adaptive method amplifies detection power in large-scale multiple hypothesis testing
Zhang, Martin J.
Xia, Fei
Zou, James
NATURE COMMUNICATIONS, 2019, 10 (1)