Unsupervised and scalable subsequence anomaly detection in large data series

Cited by: 47
Authors
Boniol, Paul [1]
Linardi, Michele [2]
Roncallo, Federico [2]
Palpanas, Themis [2]
Meftah, Mohammed [1]
Remy, Emmanuel [1]
Affiliations
[1] EDF R&D, Paris, France
[2] Univ Paris, Paris, France
Keywords
Data series; Time series; Anomalies discovery; TIME; DISCOVERY
DOI
10.1007/s00778-021-00655-8
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Subsequence anomaly (or outlier) detection in long sequences is an important problem with applications in a wide range of domains. However, the approaches proposed so far in the literature have severe limitations: they either require prior domain knowledge or become cumbersome and expensive to use in situations with recurrent anomalies of the same type. In this work, we address these problems and propose NormA, a novel approach suitable for domain-agnostic anomaly detection. NormA is based on a new data series primitive, which makes it possible to detect anomalies based on their (dis)similarity to a model that represents normal behavior. The experimental results on several real datasets demonstrate that the proposed approach correctly identifies all single and recurrent anomalies of various types, with no prior knowledge of the characteristics of these anomalies (except for their length). Moreover, it outperforms the current state-of-the-art algorithms by a large margin in terms of accuracy, while being orders of magnitude faster.
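The idea of scoring subsequences by their dissimilarity to a model of normal behavior can be pictured with a small sketch. This is not NormA itself: the abstract does not describe how the normal model is built, so the sketch makes the simplifying (and, unlike the paper, supervised) assumption that the first `train_len` points are anomaly-free and uses their subsequences as the model. All names here (`znorm`, `anomaly_scores`, `train_len`) and the toy sine series are hypothetical.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; constant subsequences map to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / len(seq))
    if std < 1e-12:
        return [0.0] * len(seq)
    return [(x - mean) / std for x in seq]


def subsequences(series, length):
    """All z-normalized sliding windows of the given length."""
    return [znorm(series[i:i + length]) for i in range(len(series) - length + 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anomaly_scores(series, length, train_len):
    """Score each window by its distance to the closest 'normal' window.

    Assumption (not from the paper): series[:train_len] is anomaly-free,
    so its windows can stand in for a model of normal behavior.
    """
    normal_model = subsequences(series[:train_len], length)
    return [
        min(euclidean(sub, pattern) for pattern in normal_model)
        for sub in subsequences(series, length)
    ]


# Hypothetical toy data: a sine wave with period 20 and a flat segment
# injected at positions 300..309.
series = [math.sin(2 * math.pi * i / 20) for i in range(400)]
series[300:310] = [0.0] * 10

scores = anomaly_scores(series, length=20, train_len=100)
top = max(range(len(scores)), key=scores.__getitem__)
print("highest-scoring window starts at", top)
```

In this toy setting, windows that overlap the injected segment are the only ones without a near-identical match in the model, so they receive the highest scores. The only anomaly-specific input is the window length, which mirrors the abstract's remark that the anomaly length is the one piece of prior knowledge required.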
Pages: 909-931
Page count: 23