Rethink the Top-u Attention in Sparse Self-attention for Long Sequence Time-Series Forecasting

Cited: 0
Authors
Meng, Xiangxu [1]
Li, Wei [1,2]
Gaber, Tarek [3]
Zhao, Zheng [1]
Chen, Chuhao [1]
Affiliations
[1] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Harbin Engn Univ, Modeling & Emulat E Govt Natl Engn Lab, Harbin 150001, Peoples R China
[3] Univ Salford, Sch Sci Engn & Environm, Manchester, England
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VI | 2023 / Vol. 14259
Funding
National Natural Science Foundation of China;
Keywords
Time-series; Top-u Attention; Long-tailed distribution; Sparse self-attention;
DOI
10.1007/978-3-031-44223-0_21
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Long time-series forecasting plays a crucial role in production and daily life, covering areas such as electric power loads, stock trends and road traffic. Attention-based models have achieved significant performance advantages thanks to the long-term modelling capability of self-attention. However, the self-attention mechanism has been criticized for its quadratic time complexity, and most subsequent work has tried to reduce this cost by exploiting the sparse distribution of attention. Following this line of work, we further investigate the position distribution of Top-u attention within the long-tailed distribution of sparse attention and propose a two-stage self-attention mechanism named ProphetAttention. Specifically, in the training phase ProphetAttention memorizes the positions of the Top-u attention queries, and in the prediction phase it uses these recorded position indices to obtain the Top-u attention directly for the sparse attention computation, thereby avoiding the redundant computation of re-measuring Top-u attention. Results on four widely used real-world datasets demonstrate that ProphetAttention improves the prediction efficiency of long sequence time-series forecasting compared with the Informer model by approximately 17%-26% across all prediction horizons and significantly increases prediction speed.
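
As a rough illustration of the two-stage idea described in the abstract, the sketch below caches the positions of the Top-u queries during training and reuses those indices at prediction time instead of re-measuring them. It is not the authors' implementation: the class name ProphetAttentionSketch, the max-minus-mean sparsity score (borrowed from Informer-style ProbSparse attention) and the choice to share the cached indices across the batch are all assumptions made for illustration.

import torch
import torch.nn.functional as F

class ProphetAttentionSketch(torch.nn.Module):
    """Two-stage Top-u attention sketch: measure once, then reuse the indices."""
    def __init__(self, u: int):
        super().__init__()
        self.u = u               # number of "active" queries kept
        self.cached_idx = None   # Top-u positions memorized during training

    def forward(self, q, k, v):
        # q, k, v: (batch, length, dim)
        d = q.size(-1)
        if self.training or self.cached_idx is None:
            # Stage 1 (training): score every query and remember the Top-u positions.
            scores = q @ k.transpose(-2, -1) / d ** 0.5                   # (B, L, L)
            sparsity = scores.max(dim=-1).values - scores.mean(dim=-1)    # (B, L)
            # Share the memorized positions across the batch (an assumption of this sketch).
            self.cached_idx = sparsity.mean(dim=0).topk(self.u).indices   # (u,)
        idx = self.cached_idx
        # Stage 2 (prediction): skip the measurement, attend only with the cached queries.
        q_top = q[:, idx, :]                                              # (B, u, D)
        attn = F.softmax(q_top @ k.transpose(-2, -1) / d ** 0.5, dim=-1)  # (B, u, L)
        out = v.mean(dim=1, keepdim=True).expand(-1, q.size(1), -1).clone()  # lazy queries get mean(V)
        out[:, idx, :] = attn @ v
        return out

# Usage with assumed shapes: layer = ProphetAttentionSketch(u=8)
# q = k = v = torch.randn(4, 96, 64); y = layer(q, k, v)   # y: (4, 96, 64)

In this sketch the quadratic score matrix is only formed while the module is in training mode; once the indices are cached, prediction reduces to a (u x L) attention computation, which is the efficiency gain the abstract attributes to reusing the recorded Top-u positions.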
Pages: 256-267
Page count: 12