A new distributional treatment for time series anomaly detection

Cited by: 1
Authors
Ting, Kai Ming [1]
Liu, Zongyou [1]
Gong, Lei [1]
Zhang, Hang [1]
Zhu, Ye [2]
Affiliations
[1] Nanjing Univ, Sch Artificial Intelligence, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Deakin Univ, Ctr Cyber Resilience & Trust, Burwood, Vic, Australia
Funding
National Natural Science Foundation of China
Keywords
Time series; Anomaly detection; Isolation kernel; Distributional kernel; SIMILARITY SEARCH; DISTANCE; SUPPORT;
DOI
10.1007/s00778-023-00832-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Time series is traditionally treated with two main approaches, i.e., the time domain approach and the frequency domain approach. These approaches must rely on a sliding window so that time-shift versions of a sequence can be measured to be similar. Coupled with the use of a root point-to-point measure, existing methods often have quadratic time complexity. We offer the third $\mathbb{R}$ domain approach. It begins with an insight that sequences in a stationary time series can be treated as sets of independent and identically distributed (iid) points generated from an unknown distribution in $\mathbb{R}$. This $\mathbb{R}$ domain treatment enables two new possibilities: (a) The similarity between two sequences can be computed using a distributional measure such as Wasserstein distance (WD), kernel mean embedding or isolation distributional kernel ($\mathcal{K}_I$), and (b) these distributional measures become non-sliding-window-based. Together, they offer an alternative that has more effective similarity measurements and runs significantly faster than the point-to-point and sliding-window-based measures. Our empirical evaluation shows that $\mathcal{K}_I$ is an effective and efficient distributional measure for time series; and $\mathcal{K}_I$-based detectors have better detection accuracy than existing detectors in two tasks: (i) anomalous sequence detection in a stationary time series and (ii) anomalous time series detection in a dataset of non-stationary time series. The insight makes underutilized "old things new again" which gives existing distributional measures and anomaly detectors a new life in time series anomaly detection that would otherwise be impossible.
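The idea of comparing sequences as sets of points in $\mathbb{R}$ can be illustrated with a small sketch. It is not the authors' method and does not implement $\mathcal{K}_I$; it only uses the 1-D Wasserstein distance, one of the distributional measures the abstract names, computed for equal-size samples as the mean absolute difference of sorted values. The sine-wave data, the constants `PERIOD`, `LENGTH`, `SHIFT`, and all function names are hypothetical choices made for illustration, and a deterministic sine wave is only a toy stand-in for a stationary series.

```python
import math


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D samples.

    For equal-size empirical distributions on the real line, this is the
    mean absolute difference between the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles equal-size samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def pointwise_l1(xs, ys):
    """Mean absolute point-to-point difference (time-aligned comparison)."""
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


# Hypothetical toy setup: sequences covering four full periods of a sine wave.
PERIOD, LENGTH, SHIFT = 50, 200, 13


def wave(start, amplitude=1.0):
    """Sequence of LENGTH samples of a sine wave, starting at time `start`."""
    return [
        amplitude * math.sin(2 * math.pi * (start + t) / PERIOD)
        for t in range(LENGTH)
    ]


reference = wave(0)
shifted = wave(SHIFT)                # same shape, shifted in time
anomalous = wave(0, amplitude=0.3)   # assumed "anomaly": reduced amplitude

# The order of points is ignored, so no sliding window is needed to align
# the time-shifted sequence with the reference.
wd_shift = wasserstein_1d(reference, shifted)
wd_anom = wasserstein_1d(reference, anomalous)
pw_shift = pointwise_l1(reference, shifted)

print("Wasserstein, reference vs shifted:  ", wd_shift)
print("Wasserstein, reference vs anomalous:", wd_anom)
print("Point-to-point, reference vs shifted:", pw_shift)
```

In this toy setting the shifted sequence contains the same values as the reference in a different order, so its distributional distance is essentially zero, while the time-aligned point-to-point difference is large. The reduced-amplitude sequence is clearly separated from the reference by the distributional measure.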
Pages: 753 - 780
Number of pages: 28
Related papers
66 in total
  • [1] Isolation-based anomaly detection using nearest-neighbor ensembles
    Bandaragoda, Tharindu R.
    Ting, Kai Ming
    Albrecht, David
    Liu, Fei Tony
    Zhu, Ye
    Wells, Jonathan R.
    [J]. COMPUTATIONAL INTELLIGENCE, 2018, 34 (04) : 968 - 998
  • [2] Time series anomaly detection based on shapelet learning
    Beggel, Laura
    Kausler, Bernhard X.
    Schiegg, Martin
    Pfeiffer, Michael
    Bischl, Bernd
    [J]. COMPUTATIONAL STATISTICS, 2019, 34 (03) : 945 - 976
  • [3] Unsupervised outlier detection for time series by entropy and dynamic time warping
    Benkabou, Seif-Eddine
    Benabdeslem, Khalid
    Canitia, Bruno
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2018, 54 (02) : 463 - 486
  • [4] A Wasserstein Subsequence Kernel for Time Series
    Bock, Christian
    Togninalli, Matteo
    Ghisu, Elisabetta
    Gumbsch, Thomas
    Rieck, Bastian
    Borgwardt, Karsten
    [J]. 2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019 : 970 - 975
  • [5] SAND: Streaming Subsequence Anomaly Detection
    Boniol, Paul
    Paparrizos, John
    Palpanas, Themis
    Franklin, Michael J.
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (10) : 1717 - 1729
  • [6] Unsupervised and scalable subsequence anomaly detection in large data series
    Boniol, Paul
    Linardi, Michele
    Roncallo, Federico
    Palpanas, Themis
    Meftah, Mohammed
    Remy, Emmanuel
    [J]. VLDB JOURNAL, 2021, 30 (06) : 909 - 931
  • [7] LOF: Identifying density-based local outliers
    Breunig, MM
    Kriegel, HP
    Ng, RT
    Sander, J
    [J]. SIGMOD RECORD, 2000, 29 (02) : 93 - 104
  • [8] The Wasserstein-Fourier Distance for Stationary Time Series
    Cazelles, Elsa
    Robert, Arnaud
    Tobar, Felipe
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69 : 709 - 721
  • [9] Haar wavelets for efficient similarity search of time-series: With and without time warping
    Chan, FKP
    Fu, AWC
    Yu, C
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2003, 15 (03) : 686 - 705
  • [10] MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification
    Dempster, Angus
    Schmidt, Daniel F.
    Webb, Geoffrey I.
    [J]. KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021 : 248 - 257