PAFormer: Anomaly Detection of Time Series With Parallel-Attention Transformer

Cited by: 11
Authors
Bai, Ningning [1 ]
Wang, Xiaofeng [1 ]
Han, Ruidong [2 ,3 ]
Wang, Qin [2 ]
Liu, Zinian [2 ]
Affiliations
[1] Xian Univ Technol, Dept Math, Xian 710048, Peoples R China
[2] Xian Univ Technol, Sch Comp Sci & Engn, Xian 710048, Peoples R China
[3] Yuncheng Univ, Sch Math & Informat Technol, Yuncheng 044000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Anomaly detection; parallel-attention (PA); time series; transformer;
DOI
10.1109/TNNLS.2023.3337876
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Time-series anomaly detection is a critical task that plays a pivotal role in data mining and quality management. Current anomaly detection methods are typically based on reconstruction or forecasting algorithms, since these can learn compressed data representations and model temporal dependencies. However, most methods rely on learning normal distribution patterns, which is difficult to achieve in real-world engineering applications. Furthermore, real-world time-series data are highly imbalanced, with a severe lack of representative anomalous samples, which can cause model training to fail. In this article, we propose a novel end-to-end unsupervised framework called the parallel-attention transformer (PAFormer), which discriminates anomalies by modeling both the global characteristics and the local patterns of a time series. Specifically, we construct parallel attention (PA), which comprises two core modules: the global enhanced representation module (GERM) and the local perception module (LPM). GERM consists of two pattern units and a normalization module, with attention weights that indicate the relationship of each data point to the whole series (global). Because anomalous points are rare, they have strong associations only with adjacent data points. The LPM is composed of a learnable Laplace kernel function that learns neighborhood relevancies through the distributional properties of the kernel (local). We employ PA to learn the global-local distributional difference for each data point, which enables us to discriminate anomalies. Finally, we propose a two-stage adversarial loss to optimize the model. We conduct experiments on five public benchmark (real-world) datasets and one synthetic dataset. The results show that PAFormer outperforms state-of-the-art baselines.
Pages: 3315-3328
Page count: 14
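The abstract describes the local perception module (LPM) as a learnable Laplace kernel that captures each point's neighborhood relevancies. As a rough, self-contained illustration of how such a kernel concentrates weight near a point (it is not the paper's implementation; the function name, the `sigma` bandwidth, and the pure-Python formulation are assumptions for illustration), one can build a row-normalized Laplace-kernel weight matrix over positions of a series:

```python
import math

def laplace_local_attention(seq_len, sigma=2.0):
    """Illustrative Laplace-kernel local attention prior.

    Each unnormalized weight is exp(-|i - j| / sigma), so point i
    attends most strongly to its immediate neighbors; each row is
    then normalized to a probability distribution over the series,
    concentrated around the diagonal. In PAFormer the bandwidth is
    learned; here `sigma` is a fixed hyperparameter for simplicity.
    """
    rows = []
    for i in range(seq_len):
        w = [math.exp(-abs(i - j) / sigma) for j in range(seq_len)]
        s = sum(w)
        rows.append([x / s for x in w])
    return rows

# Example: weights for a series of length 5 with a narrow bandwidth
A = laplace_local_attention(5, sigma=1.0)
```

A smaller `sigma` yields a sharper, more strictly local distribution; comparing such local weights against global attention weights is, per the abstract, the basis for discriminating anomalies.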