Towards Generating Real-World Time Series Data

Cited by: 27
Authors
Pei, Hengzhi [1, 2]
Ren, Kan [2 ]
Yang, Yuqing [2 ]
Liu, Chang [3 ]
Qin, Tao [3 ]
Li, Dongsheng [2 ]
Affiliations
[1] University of Illinois, Urbana, IL, USA
[2] Microsoft Research Asia, Shanghai, People's Republic of China
[3] Microsoft Research Asia, Beijing, People's Republic of China
Source
2021 21st IEEE International Conference on Data Mining (ICDM 2021) | 2021
Keywords
Time series; data generation; missing values
DOI
10.1109/ICDM51629.2021.00058
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Time series data generation has drawn increasing attention in recent years. Several generative adversarial network (GAN) based methods have been proposed to tackle the problem, usually under the assumption that the targeted time series data are well-formatted and complete. However, real-world time series (RTS) data are far from this ideal: long sequences with variable lengths and informative missing data raise intractable challenges for designing powerful generation algorithms. In this paper, we propose RTSGAN, a novel generative framework for RTS data, to tackle these challenges. RTSGAN first learns an encoder-decoder module that provides a mapping between a time series instance and a fixed-dimension latent vector, and then learns a generation module to generate vectors in the same latent space. By combining the generator and the decoder, RTSGAN is able to generate RTS that respect the original feature distributions and the temporal dynamics. To generate time series with missing values, we further equip RTSGAN with an observation embedding layer and a decide-and-generate decoder to better utilize the informative missing patterns. Experiments on four RTS datasets show that the proposed framework outperforms previous generation methods in terms of synthetic data utility for downstream classification and prediction tasks. Our code is available at https://seqml.github.io/rtsgan.
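The abstract's handling of missing values can be illustrated with a toy sketch. This is not the RTSGAN implementation (see the authors' code at the URL above): the function names `to_values_and_mask` and `decide_and_generate_step`, the use of `None` to mark a missing value, and the per-feature observation probabilities are all assumptions made for illustration. The sketch only shows the general idea of pairing values with an observation mask, and of first deciding whether a feature is observed before generating its value.

```python
import random
from typing import Callable, List, Optional, Tuple

# One time step of a multivariate series; None marks a missing value (assumed convention).
Step = List[Optional[float]]


def to_values_and_mask(series: List[Step]) -> Tuple[List[List[float]], List[List[int]]]:
    """Split a series into zero-filled values and a 0/1 observation mask."""
    values = [[0.0 if x is None else x for x in step] for step in series]
    mask = [[0 if x is None else 1 for x in step] for step in series]
    return values, mask


def decide_and_generate_step(
    observe_prob: List[float],
    value_sampler: Callable[[int, random.Random], float],
    rng: random.Random,
) -> Step:
    """Toy decoding step: decide per feature whether it is observed,
    then generate a value only for the observed features."""
    step: Step = []
    for j, p in enumerate(observe_prob):
        if rng.random() < p:      # decide: is feature j observed at this step?
            step.append(value_sampler(j, rng))  # generate its value
        else:
            step.append(None)     # keep the feature missing
    return step


if __name__ == "__main__":
    # Variable-length series are simply lists of different lengths here.
    series = [[1.0, None], [None, 2.5], [3.0, 4.0]]
    values, mask = to_values_and_mask(series)
    print(values, mask)

    rng = random.Random(0)
    print(decide_and_generate_step([0.8, 0.3], lambda j, r: r.gauss(0.0, 1.0), rng))
```

In a learned model, the observation probabilities and the value sampler would come from the decoder's outputs rather than being fixed by hand as they are here.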
Pages: 469-478
Page count: 10