Outlier Detection for Temporal Data: A Survey

Cited by: 603
Authors
Gupta, Manish [1]
Gao, Jing [2]
Aggarwal, Charu C. [3]
Han, Jiawei [4]
Affiliations
[1] Microsoft, Hyderabad 500032, Andhra Pradesh, India
[2] SUNY Buffalo, Buffalo, NY 14260 USA
[3] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[4] Univ Illinois, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
Keywords
Temporal outlier detection; time series data; data streams; distributed data streams; temporal networks; spatio-temporal outliers; applications of temporal outlier detection; network outliers; TIME-SERIES; ANOMALY DETECTION; NOVELTY DETECTION; SYSTEM CALLS; ALGORITHMS; INTRUSIONS; SIMILARITY; PARAMETERS; REGRESSION; SEQUENCES;
DOI
10.1109/TKDE.2013.184
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the statistics community, outlier detection for time series data has been studied for decades. Recently, with advances in hardware and software technology, there has been a large body of work on temporal outlier detection from a computational perspective within the computer science community. In particular, advances in hardware technology have enabled the availability of various forms of temporal data collection mechanisms, and advances in software technology have enabled a variety of data management mechanisms. This has fueled the growth of different kinds of data sets such as data streams, spatio-temporal data, distributed streams, temporal networks, and time series data, generated by a multitude of applications. There arises a need for an organized and detailed study of the work done in the area of outlier detection with respect to such temporal datasets. In this survey, we provide a comprehensive and structured overview of a large set of interesting outlier definitions for various forms of temporal data, novel techniques, and application scenarios in which specific definitions and techniques have been widely used.
Pages: 2250-2267
Page count: 18