Multistate time series imputation using generative adversarial network with applications to traffic data

Times cited: 8
Authors
Li, Haitao [1]
Cao, Qian [1]
Bai, Qiaowen [1]
Li, Zhihui [1]
Hu, Hongyu [2]
Affiliations
[1] Jilin Univ, Coll Transportat, Changchun 130022, Peoples R China
[2] Jilin Univ, Coll Automot Engn, Changchun 130022, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Generative adversarial network; Multiple imputation; Time series data; Imputation; MULTIPLE IMPUTATION; MISSING-DATA; SELF-REPRESENTATION; REGRESSION; MODEL;
DOI
10.1007/s00521-022-07961-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Missing data in time series is a pervasive problem in many fields, especially in intelligent transportation systems, where it hinders the application of temporal analysis methods and the fine-tuning of control strategies. Prevalent imputation approaches reconstruct missing data with high accuracy by exploiting a precise distribution model. However, the multistate characteristic of time series data and the uncertainty of the imputation process make the temporal data distribution harder to model and reduce imputation performance. In this paper, a novel time series generative adversarial imputation network (TGAIN) model is proposed to deal with the problem of missing time series data. The model combines the strength of GANs in modeling data distributions with the ability of multiple imputation to handle uncertainty. Specifically, the TGAIN network is designed and adversarially trained to learn the multistate distribution of missing time series data. Through the conditional vector constraint and the adversarial imputation process, the latent distribution for each missing position under different states can be effectively estimated based on implicit relationships with the partially observed information. A corresponding multiple imputation strategy is then proposed to deal with the uncertainty of the imputation process and to determine the best fill value from the learned distribution. Furthermore, extensive experiments have been conducted on two real traffic flow datasets. The comparative results show that the proposed TGAIN not only models the time series data distribution and handles imputation uncertainty better, but also performs more robustly and stably even as the missing rate increases.
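The multiple imputation step described in the abstract can be illustrated with a minimal sketch. This is not the TGAIN implementation: the abstract does not specify the network or how the best fill value is chosen. The code below assumes a hypothetical stand-in generator (`toy_generator`, which samples around the nearest observed neighbours instead of using a trained GAN), a binary mask with 1 for observed and 0 for missing, and the median of several draws as the fill value. All function names and parameters are illustrative assumptions.

```python
import random
import statistics


def toy_generator(series, mask, rng):
    """Hypothetical stand-in for a trained generator (NOT TGAIN).

    Observed positions (mask == 1) are passed through unchanged.
    Each missing position (mask == 0) gets one random draw centred on
    the mean of its nearest observed neighbours.
    """
    observed = [x for x, m in zip(series, mask) if m == 1]
    if not observed:
        raise ValueError("at least one observed value is required")
    spread = statistics.pstdev(observed) if len(observed) > 1 else 1.0
    n = len(series)
    out = []
    for i in range(n):
        if mask[i] == 1:
            out.append(series[i])
            continue
        # Nearest observed value on each side, if any.
        left = next((series[j] for j in range(i - 1, -1, -1) if mask[j] == 1), None)
        right = next((series[j] for j in range(i + 1, n) if mask[j] == 1), None)
        neighbours = [v for v in (left, right) if v is not None]
        centre = statistics.mean(neighbours)
        out.append(rng.gauss(centre, 0.1 * spread))
    return out


def multiple_impute(series, mask, n_draws=20, seed=0):
    """Draw several candidate completions and aggregate them.

    Returns (imputed, uncertainty): `imputed` keeps observed values and
    fills each missing position with the median of its draws (an assumed
    aggregation rule); `uncertainty` maps each missing index to the
    standard deviation of its draws.
    """
    rng = random.Random(seed)
    draws = [toy_generator(series, mask, rng) for _ in range(n_draws)]
    imputed, uncertainty = [], {}
    for i, m in enumerate(mask):
        if m == 1:
            imputed.append(series[i])
        else:
            candidates = [d[i] for d in draws]
            imputed.append(statistics.median(candidates))
            uncertainty[i] = statistics.pstdev(candidates)
    return imputed, uncertainty


# Usage: None marks a missing reading in a short traffic-flow-like series.
flow = [10.0, None, 14.0, None, None, 20.0]
observed_mask = [1, 0, 1, 0, 0, 1]
filled, spread_by_index = multiple_impute(flow, observed_mask, n_draws=50, seed=1)
print(filled)
print(spread_by_index)
```

Drawing many candidates per missing position, rather than a single one, is what lets the spread of the draws serve as a rough uncertainty estimate alongside the chosen fill value.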
Pages: 6545-6567
Page count: 23