An efficient spatial-temporal transformer with temporal aggregation and spatial memory for traffic forecasting

Cited by: 5
Authors
Liu, Aoyu [1]
Zhang, Yaying [1, 2]
Affiliations
[1] Tongji Univ, Serv Comp, Key Lab Embedded Syst, Minist Educ, Shanghai, Peoples R China
[2] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Traffic forecasting; Transformer; Memory network; Data mining; REGRESSION; PREDICTION;
DOI
10.1016/j.eswa.2024.123884
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Traffic forecasting has widespread applications in domains such as urban traffic planning and intelligent transportation systems. Its central challenge is effectively capturing the intricate spatial-temporal correlations in traffic data. While the latest methods achieve satisfactory performance, they still suffer from two limitations: (i) Most methods overlook the memory of valuable traffic patterns at each traffic node, so they struggle to reveal dynamic spatial-temporal correlations from a broader perspective using the data's inherent periodicity and trend characteristics. (ii) As research progresses, recently proposed models have become increasingly complex and large. To address these issues, we propose a Spatial-Temporal Aggregation Memory Transformer (STAMT) for traffic forecasting. Specifically, we propose a memory bank that enhances vanilla spatial attention and caches the traffic patterns of historical input. By querying these traffic patterns, which contain rich spatial-temporal semantic information, the model can improve prediction performance by extracting trends and regularities across various periods. To reduce computational cost, we introduce a temporal module that captures temporal correlations while reducing the temporal dimension. In addition, we leverage a random feature map and the associativity of matrix multiplication, which reduce the complexity of spatial attention from quadratic to linear in the number of nodes. Finally, our theoretical analysis concludes that a single-layer spatial attention network is sufficient to capture deep spatial-temporal correlations without stacking. Extensive experiments on nine real-world datasets demonstrate that STAMT outperforms state-of-the-art baselines in regular, long-range, and large-scale traffic forecasting tasks while significantly reducing computational cost. Code is available at https://github.com/LiuAoyu1998/STAMT.
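The abstract's complexity claim rests on reordering the attention product. The sketch below is not taken from the STAMT code; it assumes a Performer-style positive random feature map (the abstract does not say which map the authors use), and all names (`positive_random_features`, `quadratic_order`, `linear_order`, the toy sizes) are hypothetical. It only illustrates that, once queries and keys are mapped to features, computing phi(Q)(phi(K)^T V) gives the same result as (phi(Q) phi(K)^T)V without ever forming the N x N node-to-node matrix.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def positive_random_features(x, w):
    """Map each row of x (N x d) to m positive features using the rows of w (m x d).

    Assumed feature map: phi(x)_j = exp(w_j . x - |x|^2 / 2) / sqrt(m).
    """
    m = len(w)
    feats = []
    for row in x:
        half_sq_norm = 0.5 * sum(v * v for v in row)
        feats.append([
            math.exp(sum(a * b for a, b in zip(w_j, row)) - half_sq_norm) / math.sqrt(m)
            for w_j in w
        ])
    return feats


def normalize(numerator, denominator):
    """Divide each row of the numerator by its scalar normalizer."""
    return [[v / den[0] for v in row] for row, den in zip(numerator, denominator)]


def quadratic_order(q_feat, k_feat, v):
    """(phi(Q) phi(K)^T) V: materializes an N x N score matrix."""
    scores = matmul(q_feat, transpose(k_feat))      # N x N
    return normalize(matmul(scores, v), [[sum(row)] for row in scores])


def linear_order(q_feat, k_feat, v):
    """phi(Q) (phi(K)^T V): intermediates are only m x d and m x 1."""
    kv = matmul(transpose(k_feat), v)               # m x d
    k_sum = [[sum(col)] for col in zip(*k_feat)]    # m x 1
    return normalize(matmul(q_feat, kv), matmul(q_feat, k_sum))


def random_matrix(rows, cols, scale):
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n_nodes, dim, n_features = 6, 4, 8                  # toy sizes, chosen arbitrarily
q = random_matrix(n_nodes, dim, 0.5)
k = random_matrix(n_nodes, dim, 0.5)
v = random_matrix(n_nodes, dim, 1.0)
w = random_matrix(n_features, dim, 1.0)             # random projection directions

q_feat = positive_random_features(q, w)
k_feat = positive_random_features(k, w)
out_quadratic = quadratic_order(q_feat, k_feat, v)
out_linear = linear_order(q_feat, k_feat, v)
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(out_quadratic, out_linear)
    for a, b in zip(row_a, row_b)
)
```

With N nodes, m features, and feature width d, the first ordering costs on the order of N^2 operations per feature, while the second costs on the order of N·m·d, which is linear in N when m and d are fixed. The two orderings agree up to floating-point rounding; how closely either approximates exact softmax attention depends on the number of random features.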
Pages: 15