Large-scale simultaneous hypothesis testing: The choice of a null hypothesis

Times cited: 660
Authors
Efron, B [1]
Affiliation
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Keywords
empirical Bayes; empirical null hypothesis; local false discovery rate; microarray analysis; unobserved covariates;
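DOI
10.1198/016214504000000089
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract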
Current scientific techniques in genomics and image processing routinely produce hypothesis testing problems with hundreds or thousands of cases to consider simultaneously. This poses new difficulties for the statistician, but also opens new opportunities. In particular, it allows empirical estimation of an appropriate null hypothesis. The empirical null may be considerably more dispersed than the usual theoretical null distribution that would be used for any one case considered separately. An empirical Bayes analysis plan for this situation is developed, using a local version of the false discovery rate to examine the inference issues. Two genomics problems are used as examples to show the importance of correctly choosing the null hypothesis.
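The contrast the abstract draws between a theoretical and an empirical null can be illustrated with a small simulation. The sketch below is not the paper's estimation procedure: it assumes a median and scaled-MAD fit as a simple stand-in for the empirical null, a Gaussian kernel estimate for the mixture density f, and the conservative choice p0 = 1 in the local false discovery rate fdr(z) = p0 * f0(z) / f(z). The function names and the synthetic z-values are hypothetical and chosen only for illustration.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    u = (x - mean) / sd
    return np.exp(-0.5 * u**2) / (sd * np.sqrt(2.0 * np.pi))

def empirical_null(z):
    """Robust centre and spread of the z-values (median and scaled MAD).
    Assumption: a stand-in for an empirical null fit, not the paper's estimator."""
    delta0 = np.median(z)
    sigma0 = 1.4826 * np.median(np.abs(z - delta0))
    return delta0, sigma0

def mixture_density(z):
    """Gaussian kernel estimate of the mixture density f at each z-value,
    using a rule-of-thumb bandwidth."""
    n = z.size
    h = 1.06 * np.std(z) * n ** (-0.2)
    u = (z[:, None] - z[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))

def local_fdr(z, delta0, sigma0, p0=1.0):
    """Local false discovery rate p0 * f0(z) / f(z), clipped to [0, 1]."""
    ratio = p0 * normal_pdf(z, delta0, sigma0) / mixture_density(z)
    return np.clip(ratio, 0.0, 1.0)

# Synthetic data (an assumption): 1900 null cases that are more dispersed
# than N(0, 1), followed by 100 non-null cases centred at 6.
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(0.0, 1.5, 1900), rng.normal(6.0, 1.0, 100)])

delta0, sigma0 = empirical_null(z)
fdr_empirical = local_fdr(z, delta0, sigma0)   # null fitted from the data
fdr_theoretical = local_fdr(z, 0.0, 1.0)       # textbook N(0, 1) null

print("empirical null centre and spread:", delta0, sigma0)
print("cases with fdr < 0.2, empirical null:", int((fdr_empirical < 0.2).sum()))
print("cases with fdr < 0.2, theoretical null:", int((fdr_theoretical < 0.2).sum()))
```

In this synthetic setting the theoretical N(0, 1) null is too narrow for the null cases, so it reports more cases below the 0.2 cutoff than the fitted null does. The cutoff itself is an arbitrary illustrative choice.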
Pages: 96-104
Page count: 9