Time-series anomaly detection with stacked Transformer representations and 1D convolutional network

Cited by: 67
Authors
Kim, Jina [1 ,2 ]
Kang, Hyeongwon [1 ]
Kang, Pilsung [1 ]
Affiliations
[1] Korea Univ, Sch Ind & Management Engn, 145 Anam Ro, Seoul 02841, South Korea
[2] Shinhan Bank, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Time series anomaly detection; Transformer; Convolutional Neural Network;
DOI
10.1016/j.engappai.2023.105964
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline code
0812;
Abstract
Time-series anomaly detection is the task of identifying data that deviate from the normal data distribution within continuously collected data. Because it supports system maintenance in various industries, time-series anomaly detection is an active research area. Most existing methodologies rely on Long Short-Term Memory (LSTM) or Convolutional Neural Network (CNN) architectures to model the temporal structure of time-series data. In this study, we propose an unsupervised, prediction-based time-series anomaly detection methodology using the Transformer, whose self-attention mechanism learns the dynamic patterns of sequential data more effectively than LSTM and CNN. The prediction model consists of an encoder comprising multiple Transformer encoder layers and a decoder that includes a 1D convolution layer. The encoder accumulates the output representation of each Transformer layer to obtain a multi-level, information-rich representation, which the decoder fuses through a 1D convolution operation. Consequently, the model can make predictions that account for both the global trend and the local variability of the input time series. Assuming the trained model produces predictions that follow the normal data distribution, the anomaly score is defined as the difference between the predicted and the actual value at the corresponding timestamp. Finally, data with an anomaly score above a threshold are detected as anomalies. Experiments on benchmark datasets show that the proposed method outperforms the baselines.
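The architecture described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the layer sizes, window length, and kernel size are assumptions, and the model here predicts only the next timestep from a sliding window, with the anomaly score taken as the absolute prediction error.

```python
# Hedged sketch of the described architecture (not the authors' code):
# multiple Transformer encoder layers whose per-layer outputs are accumulated,
# then fused by a 1D convolution; the anomaly score is the prediction error.
import torch
import torch.nn as nn


class TransformerConvPredictor(nn.Module):
    def __init__(self, n_features=8, d_model=32, n_layers=3, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model, n_heads,
                                       dim_feedforward=64, batch_first=True)
            for _ in range(n_layers)
        ])
        # Fuse the stacked per-layer representations with a 1D convolution
        # (channels = n_layers * d_model after concatenation).
        self.fuse = nn.Conv1d(n_layers * d_model, d_model,
                              kernel_size=3, padding=1)
        self.out = nn.Linear(d_model, n_features)

    def forward(self, x):                     # x: (batch, seq_len, n_features)
        h = self.embed(x)
        reps = []
        for layer in self.layers:             # accumulate every layer's output
            h = layer(h)
            reps.append(h)
        stacked = torch.cat(reps, dim=-1)     # (batch, seq_len, n_layers*d_model)
        fused = self.fuse(stacked.transpose(1, 2)).transpose(1, 2)
        return self.out(fused[:, -1])         # predict the next timestep


def anomaly_scores(model, windows, targets):
    """Mean absolute prediction error per window; values above a chosen
    threshold are flagged as anomalies."""
    with torch.no_grad():
        pred = model(windows)
    return (pred - targets).abs().mean(dim=-1)
```

For example, `anomaly_scores(TransformerConvPredictor(), torch.randn(5, 16, 8), torch.randn(5, 8))` returns one non-negative score per window; in practice the model would first be trained to predict the next value on normal data only.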
Pages: 12