Minimum required number of specimen records to develop accurate species distribution models

Cited by: 535
Authors
van Proosdij, Andre S. J. [1]
Sosef, Marc S. M. [1,4]
Wieringa, Jan J. [1]
Raes, Niels [2,3]
Affiliations
[1] Wageningen Univ, Biosystemat Grp, Droevendaalsesteeg 1, NL-6708 PB Wageningen, Netherlands
[2] Naturalis Biodivers Ctr, Bot Sect, ASJvP, Darwinweg 2, NL-2333 CR Leiden, Netherlands
[3] Naturalis Biodivers Ctr, Bot Sect, JJW, Darwinweg 2, NL-2333 CR Leiden, Netherlands
[4] Bot Garden Meise, Nieuwelaan 38, BE-1860 Meise, Belgium
Keywords
SAMPLE-SIZE; HERBARIUM COLLECTIONS; BIAS; PERFORMANCE; NICHE; DIVERSITY; RELIABILITY; PREDICTION; PRESENCES; SELECTION
DOI
10.1111/ecog.01509
Chinese Library Classification number
X176 [Biodiversity conservation]
Discipline classification code
090705
Abstract
Species distribution models (SDMs) are widely used to predict the occurrence of species. Because SDMs generally use presence-only data, validation of the predicted distribution and assessing model accuracy is challenging. Model performance depends on both sample size and species' prevalence, being the fraction of the study area occupied by the species. Here, we present a novel method using simulated species to identify the minimum number of records required to generate accurate SDMs for taxa of different pre-defined prevalence classes. We quantified model performance as a function of sample size and prevalence and found model performance to increase with increasing sample size under constant prevalence, and to decrease with increasing prevalence under constant sample size. The area under the curve (AUC) is commonly used as a measure of model performance. However, when applied to presence-only data it is prevalence-dependent and hence not an accurate performance index. Testing the AUC of an SDM for significant deviation from random performance provides a good alternative. We assessed the minimum number of records required to obtain good model performance for species of different prevalence classes in a virtual study area and in a real African study area. The lower limit depends on the species' prevalence with absolute minimum sample sizes as low as 3 for narrow-ranged and 13 for widespread species for our virtual study area which represents an ideal, balanced, orthogonal world. The lower limit of 3, however, is flawed by statistical artefacts related to modelling species with a prevalence below 0.1. In our African study area lower limits are higher, ranging from 14 for narrow-ranged to 25 for widespread species. We advocate identifying the minimum sample size for any species distribution modelling by applying the novel method presented here, which is applicable to any taxonomic clade or group, study area or climate scenario.
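The idea of testing an AUC for significant deviation from random performance can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy "centroid" model stands in for a real SDM algorithm, the 50 x 50 grid with two orthogonal gradients stands in for a virtual study area, and the names `auc`, `centroid_model_auc` and `null_model_test` are hypothetical. The sketch fits the same toy model to many sets of randomly drawn cells and asks whether the AUC obtained from the actual records exceeds the upper tail of that null distribution.

```python
import numpy as np

def auc(pos_scores, bg_scores):
    """Rank-based AUC: chance that a presence outscores a background cell (ties count 0.5)."""
    pos = np.asarray(pos_scores, dtype=float)
    bg = np.asarray(bg_scores, dtype=float)
    greater = (pos[:, None] > bg[None, :]).sum()
    ties = (pos[:, None] == bg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * bg.size)

def centroid_model_auc(env, record_idx):
    """Fit a toy centroid model (an assumption, not a real SDM) and return its AUC.

    Presence-only setting: records are compared against all cells as background.
    """
    centroid = env[record_idx].mean(axis=0)
    scores = -((env - centroid) ** 2).sum(axis=1)  # higher score = more suitable
    return auc(scores[record_idx], scores)

def null_model_test(env, record_idx, n_null=199, alpha=0.05, seed=0):
    """Compare the observed AUC with AUCs of models fitted to random cells."""
    rng = np.random.default_rng(seed)
    n = len(record_idx)
    observed = centroid_model_auc(env, record_idx)
    null = np.array([
        centroid_model_auc(env, rng.choice(env.shape[0], size=n, replace=False))
        for _ in range(n_null)
    ])
    threshold = np.quantile(null, 1 - alpha)  # upper tail of the null distribution
    return observed, threshold, bool(observed > threshold)

# Hypothetical virtual study area: 50 x 50 cells, two orthogonal gradients.
xs, ys = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
env = np.column_stack([xs.ravel(), ys.ravel()])

# Hypothetical narrow-ranged simulated species: a disc in environmental space.
occupied = np.flatnonzero(((env - 0.25) ** 2).sum(axis=1) < 0.15 ** 2)
prevalence = occupied.size / env.shape[0]  # fraction of the study area occupied

# Draw 15 "specimen records" from the occupied cells and run the test.
records = np.random.default_rng(1).choice(occupied, size=15, replace=False)
observed, threshold, significant = null_model_test(env, records)
print(f"prevalence={prevalence:.3f} AUC={observed:.3f} "
      f"null threshold={threshold:.3f} significant={significant}")
```

Repeating such a test over a range of sample sizes and prevalence classes is one way to look for the smallest sample size at which models are consistently better than random, which is the kind of lower limit the abstract reports.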
Pages: 542-552
Number of pages: 11