The area under the precision-recall curve as a performance metric for rare binary events

Times cited: 200
Authors
Sofaer, Helen R. [1]
Hoeting, Jennifer A. [2]
Jarnevich, Catherine S. [1]
Affiliations
[1] US Geol Survey, Ft Collins Sci Ctr, Ft Collins, CO 80526 USA
[2] Colorado State Univ, Dept Stat, Ft Collins, CO 80523 USA
Source
METHODS IN ECOLOGY AND EVOLUTION | 2019 / Vol. 10 / Issue 04
Keywords
discrimination performance; model assessment; performance metric; receiver operating characteristic curve; species distribution modelling; virtual species; SPECIES DISTRIBUTION MODELS; HABITAT SUITABILITY MODELS; PRESENCE-ABSENCE MODELS; PREDICTIVE PERFORMANCE; PREVALENCE; DEPENDENCE; THRESHOLDS; CLIMATE; DISTRIBUTIONS; SELECTION;
DOI
10.1111/2041-210X.13140
Chinese Library Classification number
Q14 [Ecology (bioecology)];
Discipline classification code
071012; 0713;
Abstract
Species distribution models are used to study biogeographic patterns and guide decision-making. The variable quality of these models makes it critical to assess whether a model's outputs are suitable for the intended use, but commonly used evaluation approaches are inappropriate for many ecological contexts. In particular, unrealistically high performance assessments have been associated with models for rare species and predictions over large geographic extents. We evaluated the area under the precision-recall curve (AUC-PR) as a performance metric for rare binary events, focusing on the assessment of species distribution models. Precision is the probability that a species is present given a predicted presence, while recall (more commonly called sensitivity) is the probability the model predicts presence in locations where the species has been observed. We simulated species at three levels of prevalence, compared AUC-PR and the area under the receiver operating characteristic curve (AUC-ROC) when the geographic extent of predictions was increased and assessed how well each metric reflected a model's utility to guide surveys for new populations. AUC-PR was robust to species rarity and, unlike AUC-ROC, not affected by an increasing geographic extent. The major advantages of AUC-PR arise because it does not incorporate correctly predicted absences and is therefore less prone to exaggerate model performance for unbalanced datasets. AUC-PR and precision were useful indicators of a model's utility for guiding surveys. We show that AUC-PR has important advantages for evaluating models of rare species, and its benefits in the context of unbalanced binary responses will make it applicable for other ecological studies. 
By not considering the true negative quadrant of the confusion matrix, AUC-PR ameliorates issues that arise when the geographic extent is increased beyond the species' range or when a large number of background points are used when absence information is unavailable. However, no single metric captures all aspects of performance nor provides an absolute index that can be compared across datasets. Our results indicate AUC-PR and precision can provide useful and intuitive metrics for evaluating a model's utility for guiding sampling, and can complement other metrics to help delineate a model's appropriate use.
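The abstract's central claim, that AUC-PR ignores correctly predicted absences while AUC-ROC does not, can be illustrated with a small sketch. Everything in it is an assumption made for illustration rather than the authors' method: the scores and labels are invented toy values, `auc_roc` uses the pairwise (Mann-Whitney) formulation, and `average_precision` is only one common estimator of the area under the precision-recall curve, which may differ from the estimator used in the paper.

```python
# Toy illustration (hypothetical data): adding easily rejected absences,
# as happens when the geographic extent grows beyond the species' range,
# raises AUC-ROC but leaves a precision-recall summary unchanged.

def auc_roc(scores, labels):
    """AUC-ROC as the share of (presence, absence) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Mean of the precision values measured at each true presence,
    walking down the ranking from highest to lowest score."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    true_pos = 0
    precisions = []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            precisions.append(true_pos / rank)
    return sum(precisions) / len(precisions)

# Original extent: 3 presences (label 1) and 5 absences (label 0).
scores = [0.9, 0.8, 0.4, 0.85, 0.6, 0.5, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

# Enlarged extent: 20 extra absences that the model scores very low.
scores_wide = scores + [0.01] * 20
labels_wide = labels + [0] * 20

roc_small = auc_roc(scores, labels)
roc_wide = auc_roc(scores_wide, labels_wide)
ap_small = average_precision(scores, labels)
ap_wide = average_precision(scores_wide, labels_wide)

print(f"AUC-ROC: {roc_small:.3f} -> {roc_wide:.3f}")
print(f"Average precision: {ap_small:.3f} -> {ap_wide:.3f}")
```

In this toy case the extra absences all rank below every presence, so they add only true negatives: AUC-ROC rises because each new absence is a correctly ordered pair, while precision at each presence is untouched. Absences that rank above some presences would lower precision, so the invariance shown here holds only for trivially rejected locations.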
Pages: 565-577
Page count: 13