Generic and Scalable Framework for Automated Time-series Anomaly Detection

Cited by: 281
Authors
Laptev, Nikolay [1]
Amizadeh, Saeed [1]
Flint, Ian [2]
Affiliations
[1] Yahoo Labs, Sunnyvale, CA 94085 USA
[2] Yahoo, Sunnyvale, CA USA
Source
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2015
Keywords
CHANGE-POINT DETECTION;
DOI
10.1145/2783258.2788611
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper introduces a generic and scalable framework for automated anomaly detection on large-scale time-series data. Early detection of anomalies plays a key role in maintaining consistency of a person's data and protects corporations against malicious attackers. Current state-of-the-art anomaly detection approaches suffer from scalability, use-case restrictions, difficulty of use and a large number of false positives. Our system at Yahoo, EGADS, uses a collection of anomaly detection and forecasting models with an anomaly filtering layer for accurate and scalable anomaly detection on time series. We compare our approach against other anomaly detection systems on real and synthetic data with varying time-series characteristics. We found that our framework allows for 50-60% improvement in precision and recall for a variety of use-cases. Both the data and the framework are being open-sourced. The open-sourcing of the data, in particular, represents the first-of-its-kind effort to establish the standard benchmark for anomaly detection.
Pages: 1939 - 1947
Page count: 9
Related papers
31 in total
  • [21] Change-point detection in time-series data by relative density-ratio estimation
    Liu, Song
    Yamada, Makoto
    Collier, Nigel
    Sugiyama, Masashi
    [J]. NEURAL NETWORKS, 2013, 43 : 72 - 83
  • [22] An algorithm based on singular spectrum analysis for change-point detection
    Moskvina, V
    Zhigljavsky, A
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2003, 32 (02) : 319 - 352
  • [23] Ray B. K., 2016, J TIME SERIES ANAL
  • [24] Rosner B., 2011, TECHNOMETRICS
  • [25] Storm @Twitter
    Toshniwal, Ankit
    Taneja, Siddarth
    Shukla, Amit
    Ramasamy, Karthik
    Patel, Jignesh M.
    Kulkarni, Sanjeev
    Jackson, Jason
    Gade, Krishna
    Fu, Maosong
    Donham, Jake
    Bhagat, Nikunj
    Mittal, Sailesh
    Ryaboy, Dmitriy
    [J]. SIGMOD'14: PROCEEDINGS OF THE 2014 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2014 : 147 - 156
  • [26] Vallis O., 2014, USENIX
  • [27] van der Loo M. P. J., 2010, EXTREMEVALUES R PACK
  • [28] Venkataraman S., 2006, BLACK BOX ANOMALY DE
  • [29] Rule induction for forecasting method selection: Meta-learning the characteristics of univariate time series
    Wang, Xiaozhe
    Smith-Miles, Kate
    Hyndman, Rob
    [J]. NEUROCOMPUTING, 2009, 72 (10-12) : 2581 - 2594
  • [30] Wei W.W. S., 1994, Time series analysis