The Wilcoxon-Mann-Whitney test under scrutiny

Cited by: 151
Authors
Fagerland, Morten W. [1 ]
Sandvik, Leiv [1 ]
Affiliations
[1] Oslo Univ Hosp, Ulleval Dept Res Adm, Oslo, Norway
Keywords
Wilcoxon-Mann-Whitney test; Fligner-Policello test; Brunner-Munzel test; rank transformation; robustness; heteroscedasticity; 2-SAMPLE T-TEST; RANK TRANSFORMATIONS; ROBUSTNESS; POWER;
DOI
10.1002/sim.3561
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
The Wilcoxon-Mann-Whitney (WMW) test is often used to compare the means or medians of two independent, possibly nonnormal distributions. For this problem, the true significance level of the large-sample approximate version of the WMW test is known to be sensitive to differences in the shapes of the distributions. Based on a wide-ranging simulation study, our paper shows that the problem of lack of robustness of this test is more serious than is thought to be the case. In particular, small differences in variances and moderate degrees of skewness can produce large deviations from the nominal type I error rate. This is further exacerbated when the two distributions have different degrees of skewness. Other rank-based methods, such as the Fligner-Policello (FP) test and the Brunner-Munzel (BM) test, perform similarly, although the BM test is generally better. By considering the WMW test as a two-sample t-test on ranks, we explain the results by noting some undesirable properties of the rank transformation. In practice, the ranked samples should be examined and found to sufficiently satisfy reasonable symmetry and variance homogeneity before the test results are interpreted. Copyright (C) 2009 John Wiley & Sons, Ltd.
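The sensitivity described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the sample sizes, standard deviations, number of replications, and the helper names `wmw_rejection_rate` and `rank_summary` are all assumptions chosen for illustration. It assumes NumPy and a SciPy version in which `mannwhitneyu` accepts `method="asymptotic"`. The first function estimates how often the large-sample WMW test rejects when both samples are normal with identical means and medians; the second ranks the pooled data and reports per-sample variance and skewness of the ranks, as one possible way to examine the ranked samples before interpreting the test.

```python
import numpy as np
from scipy import stats


def wmw_rejection_rate(n1, n2, sd1, sd2, n_sim=2000, alpha=0.05, seed=0):
    """Estimate the rejection rate of the large-sample WMW test when both
    samples are normal with mean 0 but standard deviations sd1 and sd2.
    (Hypothetical helper; all settings are illustrative assumptions.)"""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        x = rng.normal(0.0, sd1, n1)
        y = rng.normal(0.0, sd2, n2)
        p = stats.mannwhitneyu(
            x, y, alternative="two-sided", method="asymptotic"
        ).pvalue
        rejections += int(p < alpha)
    return rejections / n_sim


def rank_summary(x, y):
    """Rank the pooled data, then report the sample variance and skewness
    of the ranks within each group (hypothetical diagnostic helper)."""
    ranks = stats.rankdata(np.concatenate([x, y]))
    rx, ry = ranks[: len(x)], ranks[len(x):]
    return {
        "var_ranks": (np.var(rx, ddof=1), np.var(ry, ddof=1)),
        "skew_ranks": (stats.skew(rx), stats.skew(ry)),
    }


if __name__ == "__main__":
    # Equal variances: the rejection rate should sit near the nominal 5%.
    print("equal SDs:  ", wmw_rejection_rate(10, 40, 1.0, 1.0))
    # Smaller sample has the larger SD: the rate is typically inflated.
    print("unequal SDs:", wmw_rejection_rate(10, 40, 4.0, 1.0))
```

Under these assumed settings the means and medians are equal in both scenarios, so any rejection is a type I error. Markedly different rank variances or skewness values from `rank_summary` suggest the test result should be interpreted with caution.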
Pages: 1487-1497
Number of pages: 11
Related papers
23 in total
[1]  
[Anonymous], J EXP ED, DOI 10.1080/00220973.1996.10806603
[2]  
[Anonymous], 2006, SPSS 15 0
[3]  
[Anonymous], 2000, WILEY SERIES PROBABI
[4]   Increasing physicians' awareness of the impact of statistics on research outcomes: Comparative power of the t-test and Wilcoxon rank-sum test in small samples applied research [J].
Bridge, PD ;
Sawilowsky, SS .
JOURNAL OF CLINICAL EPIDEMIOLOGY, 1999, 52 (03) :229-235
[5]  
Brunner E, 2000, BIOMETRICAL J, V42, P17, DOI 10.1002/(SICI)1521-4036(200001)42:1<17::AID-BIMJ17>3.0.CO;2-U
[7]   RANK TRANSFORMATIONS AS A BRIDGE BETWEEN PARAMETRIC AND NONPARAMETRIC STATISTICS [J].
CONOVER, WJ ;
IMAN, RL .
AMERICAN STATISTICIAN, 1981, 35 (03) :124-129
[8]   ROBUST RANK PROCEDURES FOR THE BEHRENS-FISHER PROBLEM [J].
FLIGNER, MA ;
POLICELLO, GE .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1981, 76 (373) :162-168
[9]   USE OF A PRELIMINARY TEST IN COMPARING 2 SAMPLE MEANS [J].
GANS, DJ .
COMMUNICATIONS IN STATISTICS PART B-SIMULATION AND COMPUTATION, 1981, 10 (02) :163-174
[10]   Mann-Whitney test is not just a test of medians: differences in spread can be important [J].
Hart, A .
BRITISH MEDICAL JOURNAL, 2001, 323 (7309) :391-393