Imputing incomplete time-series data based on varied-window similarity measure of data sequences

Cited by: 13
Authors
Chiewchanwattana, Sirapat
Lursinsap, Chidchanok [1]
Chu, Chee-Hung Henry
Affiliations
[1] Chulalongkorn Univ, AVIC, Fac Sci, Bangkok 10330, Thailand
[2] Khon Kaen Univ, Fac Sci, Dept Comp Sci, Khon Kaen, Thailand
[3] Univ Louisiana, CACS, Lafayette, LA 70504 USA
Keywords
incomplete time-series; fill-in method; varied-window similarity measure
DOI
10.1016/j.patrec.2007.01.008
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a pattern characterization approach for imputing missing samples in time-series data. The new algorithm is based on the observation that time-series data that are manifestations of natural phenomena contain several sets of similar subsequences. Missing samples are imputed by finding a complete subsequence that is similar to the subsequence containing the missing samples and filling in the missing values from that complete subsequence. The algorithm is tested on standard benchmark and real-world data sets. The experimental results show that the imputation accuracy of the proposed algorithm, referred to as the varied-window similarity measure (VWSM) algorithm, is comparable to or better than that of traditional methods such as spline interpolation, multiple imputation (MI), and the optimal completion strategy fuzzy c-means algorithm (OCSFCM) for non-stationary time-series data. © 2007 Elsevier B.V. All rights reserved.
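The idea described in the abstract, filling a gap from a complete subsequence that resembles the gap's surroundings, can be illustrated with a minimal sketch. This is not the authors' VWSM algorithm, whose details are not given in this record. It is a simplified stand-in that assumes missing samples are marked as NaN, uses mean squared difference as the similarity measure, and tries a few context window sizes. The function name `impute_by_similar_subsequence` and the `windows` parameter are hypothetical.

```python
import numpy as np


def impute_by_similar_subsequence(series, windows=(2, 3, 4)):
    """Fill NaN gaps by copying from the most similar complete subsequence.

    Illustrative sketch only (not the published VWSM algorithm): for each
    gap, the `w` samples on either side form a template; every fully
    observed subsequence of the same length is scored against the known
    template samples, and the best match supplies the missing values.
    """
    x = np.asarray(series, dtype=float)
    out = x.copy()
    n = len(x)
    missing = np.isnan(x)

    i = 0
    while i < n:
        if not missing[i]:
            i += 1
            continue
        # Locate the end of this run of missing samples: gap is x[i:j].
        j = i
        while j < n and missing[j]:
            j += 1
        gap = j - i

        best_score, best_fill = np.inf, None
        for w in windows:  # try several context window sizes
            lo, hi = i - w, j + w
            if lo < 0 or hi > n:
                continue  # not enough context on one side
            template = x[lo:hi]
            known = ~np.isnan(template)  # compare only observed samples
            if not known.any():
                continue
            length = hi - lo
            for s in range(n - length + 1):
                cand = x[s:s + length]
                if np.isnan(cand).any():
                    continue  # candidate must be complete
                score = np.mean((cand[known] - template[known]) ** 2)
                if score < best_score:
                    best_score = score
                    best_fill = cand[w:w + gap]

        if best_fill is not None:
            out[i:j] = best_fill  # gaps with no complete match stay NaN
        i = j
    return out


# Usage: a periodic series with one sample removed.
data = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
data[6] = float("nan")
filled = impute_by_similar_subsequence(data)
```

Because a candidate must be fully observed, it can never overlap the gap it is filling, so the template is always matched against a different part of the series.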
Pages: 1091-1103
Number of pages: 13
Related papers
15 in total
[1]   Short-term forecasting of wind speed and related electrical power [J].
Alexiadis, MC ;
Dikopoulos, PS ;
Sahsamanoglou, HS ;
Manousaridis, IM .
SOLAR ENERGY, 1998, 63 (01) :61-68
[2]  
[Anonymous], LECT NOTES STAT
[3]   FI-GEM networks for incomplete time-series prediction [J].
Chiewchanwattana, S ;
Lursinsap, C .
PROCEEDING OF THE 2002 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-3, 2002, :1757-1762
[4]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[5]   METHODOLOGIES FOR THE ESTIMATION OF MISSING OBSERVATIONS IN TIME-SERIES [J].
FERREIRO, O .
STATISTICS & PROBABILITY LETTERS, 1987, 5 (01) :65-69
[6]   UNSUPERVISED OPTIMAL FUZZY CLUSTERING [J].
GATH, I ;
GEVA, AB .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1989, 11 (07) :773-781
[7]  
GHAHRAMANI, 1994, ADV NEURAL INFORM PR
[8]   Fuzzy c-means clustering of incomplete data [J].
Hathaway, RJ ;
Bezdek, JC .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2001, 31 (05) :735-744
[9]  
Little Roderick JA., 1987, Statistical analysis with missing data
[10]   Combined learning and use for a mixture model equivalent to the RBF classifier [J].
Miller, DJ ;
Uyar, HS .
NEURAL COMPUTATION, 1998, 10 (02) :281-293