Bayesian and frequentist testing for differences between two groups with parametric and nonparametric two-sample tests

Cited by: 21
Authors
Kelter, Riko [1]
Affiliations
[1] Univ Siegen, Dept Math, D-57072 Siegen, Nrw, Germany
Source
WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS | 2021 / Vol. 13 / Issue 06
Keywords
Bayesian two-sample tests; Mann-Whitney U test; null hypothesis significance testing; Student's t-test; testing for differences between two groups; HYPOTHESIS; DISTRIBUTIONS; STATISTICS; MOVEMENTS; MIXTURES; VALUES;
DOI
10.1002/wics.1523
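Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract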
Testing for differences between two groups is one of the scenarios most often faced by scientists across all domains and is particularly important in the medical sciences and psychology. The traditional solution to this problem is rooted inside the Neyman-Pearson theory of null hypothesis significance testing and uniformly most powerful tests. In the last decade, a lot of progress has been made in developing Bayesian versions of the most common parametric and nonparametric two-sample tests, including Student's t-test and the Mann-Whitney U test. In this article, we review the underlying assumptions, models and implications for research practice of these Bayesian two-sample tests and contrast them with the existing frequentist solutions. Also, we show that in general Bayesian and frequentist two-sample tests have different behavior regarding the type I and II error control, which needs to be carefully balanced in practical research.
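As a rough illustration of the two frequentist statistics named in the abstract, the following minimal sketch computes the pooled-variance Student's t statistic and the Mann-Whitney U statistic with the Python standard library only. The function names and the sample data are hypothetical and are not taken from the article; the sketch computes test statistics only (no p-values, no Bayes factors) and does not reproduce the article's Bayesian methods.

```python
from math import sqrt
from statistics import mean, variance


def student_t_statistic(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance (assumes equal group variances).
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


def mann_whitney_u(x, y):
    """U statistic for x: number of pairs with x_i > y_j, ties counted as 1/2."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Made-up illustrative measurements for two groups.
group_a = [5.1, 4.9, 6.2, 5.8, 6.0]
group_b = [4.2, 4.8, 5.0, 4.4, 4.6]

t_stat, df = student_t_statistic(group_a, group_b)
u_stat = mann_whitney_u(group_a, group_b)
print(f"t = {t_stat:.3f} on {df} degrees of freedom, U = {u_stat}")
```

The t statistic relies on the parametric assumptions of normality and equal variances, whereas U depends only on the ordering of the observations, which is the parametric versus nonparametric contrast the abstract refers to.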
Pages: 29