Time Series Data Cleaning: A Survey

Cited by: 61
Authors
Wang, Xi [1]
Wang, Chen [1]
Affiliations
[1] Tsinghua Univ, Sch Software, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data cleaning; data quality; time series; FIXING NUMERICAL ATTRIBUTES; OF-THE-ART; ANOMALY DETECTION; PARAMETER-ESTIMATION; MAXIMUM-LIKELIHOOD; OUTLIER DETECTION; TEMPORAL DATA; MARKOV MODEL; STATE; DATABASES;
DOI
10.1109/ACCESS.2019.2962152
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Errors are prevalent in time series data, particularly in the industrial field. Data with errors cannot be stored in the database, which results in the loss of data assets. At present, besides keeping the original erroneous data, discarding it, or checking it manually, we can also apply the cleaning algorithms widely used in databases to clean time series data automatically. This survey provides a classification of time series data cleaning techniques and comprehensively reviews the state-of-the-art methods of each type. In addition, we summarize data cleaning tools, systems, and evaluation criteria from research and industry. Finally, we highlight possible directions for time series data cleaning.
Pages: 1866-1881
Page count: 16