Generic and Scalable Framework for Automated Time-series Anomaly Detection

Cited by: 281
Authors
Laptev, Nikolay [1]
Amizadeh, Saeed [1]
Flint, Ian [2]
Affiliations
[1] Yahoo Labs, Sunnyvale, CA 94085 USA
[2] Yahoo, Sunnyvale, CA USA
Source
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2015
Keywords
CHANGE-POINT DETECTION;
DOI
10.1145/2783258.2788611
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper introduces a generic and scalable framework for automated anomaly detection on large-scale time-series data. Early detection of anomalies plays a key role in maintaining the consistency of a person's data and protecting corporations against malicious attackers. Current state-of-the-art anomaly detection approaches suffer from poor scalability, use-case restrictions, difficulty of use, and a large number of false positives. Our system at Yahoo, EGADS, uses a collection of anomaly detection and forecasting models with an anomaly filtering layer for accurate and scalable anomaly detection on time series. We compare our approach against other anomaly detection systems on real and synthetic data with varying time-series characteristics. We found that our framework allows for a 50-60% improvement in precision and recall for a variety of use cases. Both the data and the framework are being open-sourced. The open-sourcing of the data, in particular, represents a first-of-its-kind effort to establish a standard benchmark for anomaly detection.
Pages: 1939-1947
Page count: 9