Selection of significant input variables for time series forecasting

Cited by: 37
Authors
Tran, H. D. [1]
Muttil, N. [2, 3]
Perera, B. J. C. [2, 3]
Affiliations
[1] RMIT Univ, Sch Civil Environm & Chem Engn, Melbourne, Vic 3001, Australia
[2] Victoria Univ, Coll Engn & Sci, Melbourne, Vic 8001, Australia
[3] Victoria Univ, Inst Sustainabil & Innovat, Melbourne, Vic 8001, Australia
Keywords
Time series forecasting; Input variable selection; Genetic programming; Partial mutual information; Correlation; ARTIFICIAL NEURAL-NETWORKS; PARTIAL MUTUAL INFORMATION; DATA-DRIVEN MODELS; ALGORITHM; FRAMEWORK
DOI
10.1016/j.envsoft.2014.11.018
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Appropriate selection of inputs for time series forecasting models is important because it can improve forecasting performance and reduce the cost of data collection. This paper investigates the selection performance of three input selection techniques: two model-free techniques, partial linear correlation (PLC) and partial mutual information (PMI), and a model-based technique based on genetic programming (GP). Four hypothetical datasets and two real datasets were used to demonstrate the performance of the three techniques. The results suggest that two techniques are recommended for the input selection task: the model-free PLC technique, because of its computational simplicity, and the model-based GP technique, because of its ability to detect non-linear relationships (demonstrated by its relatively good performance on a hypothetical complex non-linear dataset). Candidate inputs selected by both recommended techniques should be considered significant inputs. (C) 2014 Elsevier Ltd. All rights reserved.
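As a rough illustration of the model-free PLC idea named in the abstract, the sketch below performs forward selection by partial linear correlation: at each step it scores every remaining candidate by its correlation with the output after removing the linear effect of the inputs already selected. This is a minimal sketch, not the authors' implementation. The function names `residuals` and `plc_select`, the fixed `threshold` stopping rule, and its default value are assumptions made for illustration; the abstract does not state the paper's actual termination criterion.

```python
import numpy as np


def residuals(v, Z):
    """Residual of v after least-squares regression on the columns of Z plus an intercept."""
    A = np.column_stack([np.ones(len(v)), Z])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef


def plc_select(X, y, threshold=0.2):
    """Forward selection by partial linear correlation (illustrative only).

    X: (n_samples, n_candidates) array of candidate inputs.
    y: (n_samples,) output series.
    threshold: hypothetical cut-off on |partial correlation| used to stop.
    Returns the indices of the selected columns, in selection order.
    """
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining:
        Z = X[:, selected]          # inputs chosen so far (may have zero columns)
        ry = residuals(y, Z)        # part of y not explained by selected inputs
        scores = {}
        for j in remaining:
            rx = residuals(X[:, j], Z)
            r = np.corrcoef(rx, ry)[0, 1]
            # A candidate fully explained by Z gives an undefined correlation.
            scores[j] = abs(r) if np.isfinite(r) else 0.0
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break                   # no remaining candidate adds linear information
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `plc_select(X, y)` on a matrix of lagged candidate inputs returns column indices in the order they were picked. Because the score is a linear correlation, a candidate that is related to the output only non-linearly can be missed, which is consistent with the abstract's recommendation to pair PLC with the model-based GP technique.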
Pages: 156-163
Number of pages: 8