Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation

Cited by: 628
Authors
Mason, SJ [1]
Graham, NE [1]
Affiliations
[1] Univ Calif San Diego, Scripps Inst Oceanog, Div Climate Res, La Jolla, CA 92093 USA
Keywords
forecast verification; Mann-Whitney U-test; probabilistic forecasts; signal-detection theory; Student's t-test
DOI
10.1256/003590002320603584
Chinese Library Classification number
P4 [Atmospheric sciences (meteorology)]
Discipline classification code
0706; 070601
Abstract
The areas beneath the relative (or receiver) operating characteristics (ROC) and relative operating levels (ROL) curves can be used as summary measures of forecast quality, but statistical significance tests for these areas are conducted infrequently in the atmospheric sciences. A development of signal-detection theory, the ROC curve has been widely applied in the medical and psychology fields, where significance tests and relationships to other common statistical methods have been established and described. This valuable literature appears to be largely unknown to the atmospheric sciences, where applications of ROC and related techniques are becoming more common. This paper presents a survey of that literature with a focus on the interpretation of the ROC area in the field of forecast verification. We extend these foundations to demonstrate that similar principles can be applied to the interpretation and significance testing of the ROL area. It is shown that the ROC area is equivalent to the Mann-Whitney U-statistic testing the significance of forecast event probabilities for cases where events actually occurred with those where events did not occur. A similar derivation shows that the ROL area is equivalent to the Mann-Whitney U-statistic testing the magnitude of events with respect to whether or not an event has been forecast. Because the Mann-Whitney U-statistic follows a known probability distribution, under certain assumptions it can be used to define the statistical significance of ROC and ROL areas and for comparing the areas of competing forecasts. For large samples the significance of either measure can be accurately assessed using a normal-distribution approximation.
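The equivalence stated in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `roc_area_from_u` and `normal_approx_p_value` are hypothetical, the forecast probabilities are invented for illustration, and the significance test uses the standard large-sample normal approximation to the Mann-Whitney U-statistic without a correction for ties.

```python
import math


def roc_area_from_u(event_probs, nonevent_probs):
    """ROC area computed as the Mann-Whitney U-statistic divided by n1 * n0.

    event_probs: forecast probabilities issued when the event occurred.
    nonevent_probs: forecast probabilities issued when it did not occur.
    """
    u = 0.0
    for p_event in event_probs:
        for p_non in nonevent_probs:
            if p_event > p_non:
                u += 1.0   # event case ranked above non-event case
            elif p_event == p_non:
                u += 0.5   # ties count as half
    return u / (len(event_probs) * len(nonevent_probs))


def normal_approx_p_value(area, n_events, n_nonevents):
    """One-sided p-value for the null hypothesis of no skill (area = 0.5).

    Assumes the large-sample normal approximation to U and ignores ties.
    """
    n1, n0 = n_events, n_nonevents
    u = area * n1 * n0
    mean_u = n1 * n0 / 2.0
    sd_u = math.sqrt(n1 * n0 * (n1 + n0 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Invented sample: 4 event cases and 5 non-event cases.
events = [0.9, 0.8, 0.6, 0.4]
nonevents = [0.7, 0.5, 0.3, 0.2, 0.1]
area = roc_area_from_u(events, nonevents)
p_value = normal_approx_p_value(area, len(events), len(nonevents))
print(f"ROC area = {area:.2f}, approximate one-sided p = {p_value:.3f}")
```

With only nine cases the normal approximation is rough; the sample is kept small so the pair counting can be checked by hand. The same counting would apply to the ROL area if the two groups were event magnitudes split by whether an event was forecast, as the abstract describes.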
Pages: 2145-2166
Page count: 22