Synthetic Time Series Generation for Decision Intelligence Using Large Language Models

Cited by: 2
Authors
Grigoras, Alexandru [1]
Leon, Florin [1]
Affiliations
[1] Gheorghe Asachi Tech Univ Iasi, Fac Automat Control & Comp Engn, Bd Mangeron 27, Iasi 700050, Romania
Keywords
transformer architecture; large language models; synthetic data; time series; decision intelligence
DOI
10.3390/math12162494
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
A model for generating synthetic time series data using pre-trained large language models is proposed. Starting with the Google T5-base model, which employs an encoder-decoder transformer architecture, the model underwent pre-training on diverse datasets. It was then fine-tuned using the QLoRA technique, which reduces computational complexity by quantizing weight parameters. The process involves the tokenization of time series data through mean scaling and quantization. The performance of the model was evaluated with fidelity, utility, and privacy metrics, showing improvements in fidelity and utility but a trade-off with reduced privacy. The proposed model offers a foundation for decision intelligence systems.
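The tokenization step mentioned in the abstract (mean scaling followed by quantization) can be illustrated with a minimal sketch. This is only one plausible reading, not the authors' implementation: the function names, the uniform bin grid, the bin count of 100, and the clipping range of [-5, 5] are all assumptions chosen for illustration.

```python
import numpy as np

def mean_scale(series, eps=1e-8):
    """Divide a series by its mean absolute value (assumed scaling rule)."""
    series = np.asarray(series, dtype=float)
    scale = np.mean(np.abs(series))
    if scale < eps:
        # Avoid dividing by zero for an all-zero series.
        scale = 1.0
    return series / scale, scale

def quantize(scaled, num_bins=100, low=-5.0, high=5.0):
    """Map scaled values to integer token ids in [0, num_bins - 1].

    Bins are uniform on [low, high]; values outside the range fall
    into the first or last bin. All three parameters are hypothetical.
    """
    edges = np.linspace(low, high, num_bins + 1)
    # Using only the interior edges makes digitize return 0..num_bins-1.
    return np.digitize(scaled, edges[1:-1])

def dequantize(ids, scale, num_bins=100, low=-5.0, high=5.0):
    """Map token ids back to approximate values using bin centers."""
    edges = np.linspace(low, high, num_bins + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[np.asarray(ids)] * scale

series = [10.0, 20.0, 30.0, 40.0]
scaled, scale = mean_scale(series)
token_ids = quantize(scaled)
reconstructed = dequantize(token_ids, scale)
```

Under these assumptions, the token ids form a finite vocabulary that a sequence model can consume, and the reconstruction error for in-range values is bounded by half a bin width times the scale.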
Pages: 17
Related papers
45 in total
[1] [Anonymous], Exploring Synthetic Data: Advantages and Use Cases
[2] Ansari AF, 2024, arXiv:2403.07815, DOI 10.48550/arXiv.2403.07815
[3] Borisov V., 2022, P 11 INT C LEARN REP
[4] Carballo KV, 2023, arXiv:2206.10381
[5] Chen, Tianqi; Guestrin, Carlos. XGBoost: A Scalable Tree Boosting System [J]. KDD'16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016: 785-794
[6] Choi E., 2017, P 2 MACH LEARN HEALT, P286
[7] Creating Synthetic, Time Series Data for Global Financial Institutions-A POC Deep Dive
[8] Daneshfar F., 2024, Passer J Basic Appl Sci, V6, P265, DOI 10.24271/psr.2024.440793.1484
[9] Dettmers T, 2023, arXiv:2305.14314, DOI 10.48550/arXiv.2305.14314
[10] El Emam K., 2020, Practical Synthetic Data Generation: Balancing Privacy and the Broad Availability of Data