A Comparative Study of Performance Estimation Methods for Time Series Forecasting

Cited by: 27
Authors
Cerqueira, Vitor [1 ,2 ]
Torgo, Luis [1 ,2 ]
Smailovic, Jasmina [3 ]
Mozetic, Igor [3 ]
Affiliations
[1] LIAAD INESCTEC, Porto, Portugal
[2] Univ Porto, Porto, Portugal
[3] Jozef Stefan Inst, Jamova 39, Ljubljana 1000, Slovenia
Source
2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2017
Funding
EU Horizon 2020
Keywords
performance estimation; model selection; cross-validation; time series
DOI
10.1109/DSAA.2017.7
Chinese Library Classification (CLC)
TP [Automation and computer technology]
Subject classification code
0812
Abstract
Performance estimation denotes the task of estimating the loss that a predictive model will incur on unseen data. These procedures are part of the pipeline of every machine learning task and are used to assess the overall generalisation ability of models. In this paper we address the application of these methods to time series forecasting tasks. For independent and identically distributed data, the most common approach is cross-validation. However, the dependency among observations in time series raises some caveats about the most appropriate way to estimate performance on such datasets, and currently there is no settled way to do so. We compare different variants of cross-validation and of out-of-sample approaches using two case studies: one with 53 real-world time series and another with three synthetic time series. Results show noticeable differences between the performance estimation methods in the two scenarios. In particular, the empirical experiments suggest that cross-validation approaches can be applied to stationary synthetic time series. However, in real-world scenarios the most accurate estimates are produced by the out-of-sample methods, which preserve the temporal order of observations.
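To make the distinction discussed in the abstract concrete, the following is a minimal illustrative sketch (not taken from the paper) contrasting standard K-fold cross-validation, which ignores temporal order, with an out-of-sample scheme in which every test block comes strictly after its training block. It assumes scikit-learn's KFold and TimeSeriesSplit; the series y and the number of splits are hypothetical placeholders.

```python
# Sketch: K-fold CV vs. out-of-sample evaluation for a time series.
# Assumes scikit-learn; the data and split counts are illustrative only.
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit

n = 100                   # hypothetical series length
y = np.arange(n)          # stand-in for a univariate time series

# Standard K-fold CV: observations are shuffled across folds, so future
# values can end up in the training set of a fold that tests on the past.
kf = KFold(n_splits=5, shuffle=True, random_state=1)
for train_idx, test_idx in kf.split(y):
    pass  # fit on y[train_idx], evaluate on y[test_idx]

# Out-of-sample (forward) validation: the temporal order is preserved,
# so each test block lies entirely after its training block.
tscv = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in tscv.split(y):
    assert train_idx.max() < test_idx.min()
```

Whether the shuffled variant is admissible is exactly the question the paper studies: it tends to be defensible for stationary synthetic series, while order-preserving out-of-sample estimates are reported as more reliable on real-world data.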
Pages: 529-538
Number of pages: 10