Large-Scale Unusual Time Series Detection

Cited by: 142
Authors
Hyndman, Rob J. [1]
Wang, Earo [1]
Laptev, Nikolay [2]
Affiliations
[1] Monash Univ, Clayton, Vic 3800, Australia
[2] Yahoo Res, San Francisco, CA USA
Source
2015 IEEE International Conference on Data Mining Workshop (ICDMW) | 2015
Keywords
Feature Space; Multivariate Anomaly Detection; Outliers; Time Series Characteristics
DOI
10.1109/ICDMW.2015.104
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
It is becoming increasingly common for organizations to collect very large amounts of data over time, and to need to detect unusual or anomalous time series. For example, Yahoo has banks of mail servers that are monitored over time. Many measurements on server performance are collected every hour for each of thousands of servers. We wish to identify servers that are behaving unusually. We compute a vector of features on each time series, measuring characteristics of the series. The features may include lag correlation, strength of seasonality, spectral entropy, etc. Then we use a principal component decomposition on the features, and use various bivariate outlier detection methods applied to the first two principal components. This enables the most unusual series, based on their feature vectors, to be identified. The bivariate outlier detection methods used are based on highest density regions and alpha-hulls.
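The pipeline described in the abstract (features per series, principal components, then bivariate outlier ranking) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three feature definitions are simplified stand-ins, the function names (`feature_matrix`, `first_two_pcs`, `density_scores`, `most_unusual`) are hypothetical, and the density ranking uses a hand-rolled Gaussian kernel estimate with a rule-of-thumb bandwidth as a rough proxy for the highest density region method. The alpha-hull method is not covered.

```python
import numpy as np


def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series."""
    x = x - x.mean()
    denom = np.dot(x, x)
    return float(np.dot(x[:-1], x[1:]) / denom) if denom > 0 else 0.0


def spectral_entropy(x):
    """Shannon entropy of the normalized periodogram, scaled to [0, 1]."""
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    power = power[1:]  # drop the zero-frequency term
    total = power.sum()
    if total == 0:
        return 0.0
    p = power / total
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(len(power)))


def seasonal_strength(x, period):
    """Share of variance explained by per-phase means (a crude proxy)."""
    n = (len(x) // period) * period
    cycles = x[:n].reshape(-1, period)
    seasonal = np.tile(cycles.mean(axis=0), n // period)
    var = x[:n].var()
    return float(seasonal.var() / var) if var > 0 else 0.0


def feature_matrix(series, period=24):
    """One row of features per time series."""
    rows = []
    for x in series:
        x = np.asarray(x, dtype=float)
        rows.append([lag1_autocorr(x), spectral_entropy(x),
                     seasonal_strength(x, period)])
    return np.array(rows)


def first_two_pcs(features):
    """Project standardized features onto their first two principal components."""
    std = features.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # avoid dividing by zero
    z = (features - features.mean(axis=0)) / std
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:2].T


def density_scores(points):
    """Gaussian product-kernel density estimate evaluated at each point."""
    n, d = points.shape
    h = points.std(axis=0) * n ** (-1.0 / (d + 4))  # rule-of-thumb bandwidth
    h = np.where(h > 0, h, 1.0)
    diff = (points[:, None, :] - points[None, :, :]) / h
    kern = np.exp(-0.5 * (diff ** 2).sum(axis=2))
    return kern.sum(axis=1) / (n * np.prod(h) * (2 * np.pi) ** (d / 2))


def most_unusual(series, k=3, period=24):
    """Indices of the k series with the lowest estimated density."""
    pcs = first_two_pcs(feature_matrix(series, period))
    return np.argsort(density_scores(pcs))[:k]
```

Ranking by estimated density mirrors the idea behind highest density regions: the points that fall outside a high-coverage region are exactly those with the lowest density. The default `period=24` assumes hourly data with a daily cycle, matching the hourly server measurements mentioned in the abstract.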
Pages: 1616-1619
Page count: 4