Evaluation Procedures for Forecasting with Spatio-Temporal Data

Cited by: 3
Authors
Oliveira, Mariana [1, 3]
Torgo, Luis [1, 2, 3]
Costa, Vitor Santos [1, 3]
Affiliations
[1] Univ Porto, Porto, Portugal
[2] Dalhousie Univ, Halifax, NS, Canada
[3] INESC TEC, Porto, Portugal
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2018, PT I | 2019 / Vol. 11051
Keywords
Evaluation methods; Performance estimation; Cross-validation; Spatio-temporal data; Geo-referenced time series; Reproducible research; CROSS-VALIDATION; INTERPOLATION; REGRESSION; WATER
DOI
10.1007/978-3-030-10925-7_43
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The amount of available spatio-temporal data has been increasing as large-scale data collection (e.g., from geosensor networks) becomes more prevalent. This has led to an increase in spatio-temporal forecasting applications using geo-referenced time series data motivated by important domains such as environmental monitoring (e.g., air pollution index, forest fire risk prediction). Being able to properly assess the performance of new forecasting approaches is fundamental to achieve progress. However, the dependence between observations that the spatio-temporal context implies, besides being challenging in the modelling step, also raises issues for performance estimation as indicated by previous work. In this paper, we empirically compare several variants of cross-validation (CV) and out-of-sample (OOS) performance estimation procedures that respect data ordering, using both artificially generated and real-world spatio-temporal data sets. Our results show both CV and OOS reporting useful estimates. Further, they suggest that blocking may be useful in addressing CV's bias to underestimate error. OOS can be very sensitive to test size, as expected, but estimates can be improved by careful management of the temporal dimension in training.
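The two families of procedures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `temporal_block_cv` and `oos_holdout` are hypothetical, and the sketch covers only one blocked-in-time CV variant and one simple holdout, whereas the paper compares several variants. It assumes observations are indexed by integer time steps and that all locations observed at a given time step are assigned to the same side of a split.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]  # (train time indices, test time indices)


def temporal_block_cv(n_times: int, n_folds: int) -> Iterator[Split]:
    """Blocked CV sketch: each fold tests on one contiguous block of time steps
    and trains on all remaining time steps (before and after the block)."""
    # Integer block boundaries, e.g. n_times=10, n_folds=5 -> 0, 2, 4, 6, 8, 10
    bounds = [k * n_times // n_folds for k in range(n_folds + 1)]
    for k in range(n_folds):
        lo, hi = bounds[k], bounds[k + 1]
        test = list(range(lo, hi))
        train = [t for t in range(n_times) if t < lo or t >= hi]
        yield train, test


def oos_holdout(n_times: int, test_fraction: float) -> Split:
    """OOS sketch: train on the earliest time steps, test on the most recent ones,
    so the temporal order of the data is respected."""
    n_test = max(1, round(n_times * test_fraction))
    cut = n_times - n_test
    return list(range(cut)), list(range(cut, n_times))


if __name__ == "__main__":
    for train, test in temporal_block_cv(10, 5):
        print("CV  train:", train, "test:", test)
    print("OOS split:", oos_holdout(10, 0.2))
```

In the CV sketch, training data may come from after the test block, which is one reason CV estimates can differ from OOS estimates; the OOS split never trains on the future, but its estimate depends on the single test window chosen through `test_fraction`.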
Pages: 703-718
Page count: 16