Temporal self-attention-based Conv-LSTM network for multivariate time series prediction

Cited by: 74
Authors
Fu, En [1 ]
Zhang, Yinong [2 ]
Yang, Fan [3 ]
Wang, Shuying [2 ]
Affiliations
[1] Beijing Union Univ, Beijing Key Lab Informat Serv Engn, Beijing 100101, Peoples R China
[2] Beijing Union Univ, Coll Urban Rail Transit & Logist, Beijing 100101, Peoples R China
[3] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Keywords
Self-attention mechanism; Long short-term memory; Multivariate time series; Prediction;
DOI
10.1016/j.neucom.2022.06.014
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Time series play an important role in many fields, such as industrial control, automated monitoring, and weather forecasting. Because real-world problems often involve more than one variable, and these variables are related to each other, the multivariate time series (MTS) is introduced. Accurately predicting MTS from historical observations remains very challenging. Therefore, a new time series prediction model is proposed based on the temporal self-attention mechanism, convolutional neural network, and long short-term memory (Conv-LSTM). When the standard attention mechanism for time series is combined with a recurrent neural network (RNN), it depends heavily on the hidden state of the RNN. In particular, at the first time step, an initial hidden state (typically 0) must be artificially introduced to calculate the attention weight of that step, which adds noise to the attention-weight calculation. To address this problem and increase the flexibility of the attention layer, a new self-attention mechanism, called temporal self-attention, is designed to extract the temporal dependence of the MTS. In this attention mechanism, long short-term memory (LSTM) is adopted as a sequence encoder to calculate the query, key, and value, yielding a more complete temporal dependence than standard self-attention. Owing to the flexibility of this structure, the DA-Conv-LSTM model, a state-of-the-art attention-based method for MTS prediction, was improved. Our improved model was compared with six baseline models on multiple datasets (SML2010 and NASDAQ100) and applied to satellite state prediction (our private dataset). Experiments demonstrated the effectiveness of our temporal self-attention, and our improved model achieved the best short-term prediction performance. (c) 2022 Elsevier B.V. All rights reserved.
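For illustration, the following is a minimal PyTorch sketch of the temporal self-attention idea described in the abstract: LSTM encoders, rather than the linear projections of standard self-attention, produce the query, key, and value sequences, which are then combined by scaled dot-product attention over the time axis. All names, layer sizes, and the use of three separate single-layer LSTMs are assumptions made for this sketch, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalSelfAttention(nn.Module):
    """Sketch of temporal self-attention: LSTM encoders (assumed here to be
    three separate single-layer LSTMs) compute the query, key, and value
    sequences instead of the usual linear projections."""

    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        # One LSTM per attention component; batch_first gives (batch, time, dim)
        self.query_lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.key_lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.value_lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.scale = hidden_dim ** 0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features) window of a multivariate time series
        q, _ = self.query_lstm(x)   # (B, T, H)
        k, _ = self.key_lstm(x)     # (B, T, H)
        v, _ = self.value_lstm(x)   # (B, T, H)
        # Scaled dot-product attention across time steps
        scores = torch.bmm(q, k.transpose(1, 2)) / self.scale  # (B, T, T)
        weights = F.softmax(scores, dim=-1)
        return torch.bmm(weights, v)  # (B, T, H)

# Usage example with illustrative shapes: 32 windows of 20 steps, 8 variables
attn = TemporalSelfAttention(input_dim=8, hidden_dim=64)
out = attn(torch.randn(32, 20, 8))  # -> (32, 20, 64)
```

Because each LSTM conditions every query, key, and value on the preceding observations, no artificial initial hidden state enters the attention-weight computation itself, which is the noise source the abstract identifies in RNN-coupled standard attention.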
Pages: 162-173
Number of pages: 12