PT-Tuning: Bridging the Gap between Time Series Masked Reconstruction and Forecasting via Prompt Token Tuning

Cited by: 0
Authors
Liu, Hao [1 ]
Gan, Jinrui [1 ]
Fan, Xiaoxuan [1 ]
Zhang, Yi [1 ]
Luo, Chuanxian [2 ]
Zhang, Jing [2 ]
Jiang, Guangxin [3 ]
Qian, Yucheng [4 ]
Zhao, Changwei [4 ]
Ma, Huan [5 ]
Guo, Zhenyu [5 ]
Affiliations
[1] State Grid Smart Grid Res Inst Co Ltd, State Grid Lab Grid Adv Comp & Applicat, Beijing, Peoples R China
[2] State Grid Elect Power Res Inst Co Ltd, Wuhan Nan Ltd Liabil Co, Wuhan, Peoples R China
[3] State Grid Inner Mongolia East Power Co Ltd, Hohhot, Peoples R China
[4] State Grid Anhui Elect Power Co Ltd, Elect Power Res Inst, Beijing, Peoples R China
[5] State Grid Anhui Elect Power Co Ltd, Ultra High Voltage Co, Beijing, Peoples R China
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2024, PT 2 | 2025 / Vol. 14851
Keywords
Time series; Representation learning; Mask reconstruction; Forecasting; Fine-tuning;
DOI
10.1007/978-981-97-5779-4_10
Chinese Library Classification (CLC) Code
TP31 [Computer Software];
Subject Classification Codes
081202 ; 0835 ;
Abstract
Self-supervised learning has been actively studied in the time series domain, especially for masked reconstruction. Most of these methods follow the "Pre-training + Fine-tuning" paradigm, in which a new decoder replaces the pre-trained decoder to fit a specific downstream task, leading to an inconsistency between the upstream and downstream tasks. In this paper, we point out that unifying the task objectives and adapting to the task difficulty are critical for bridging the gap between masked reconstruction and forecasting. By retaining the pre-trained mask token during the fine-tuning stage, forecasting can be treated as a special case of masked reconstruction, where future values are masked and reconstructed from historical values. This guarantees the consistency of task objectives, but a gap in task difficulty remains, because masked reconstruction can exploit contextual information whereas forecasting can only use historical information for reconstruction. To further mitigate this gap, we propose a simple yet effective prompt token tuning (PT-Tuning) paradigm, in which all pre-trained parameters are frozen and only a few trainable prompt tokens are added to the extended mask tokens in an element-wise manner. Extensive experiments on real-world datasets demonstrate the superiority of the proposed paradigm, which achieves state-of-the-art performance compared to representation learning and end-to-end forecasting methods.
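The idea described in the abstract can be illustrated with a minimal sketch. The PyTorch code below is not the authors' implementation; the class name PTTuningHead, the stand-in Transformer encoder, and all dimensions are assumptions chosen only to show the core mechanism: the pre-trained encoder and mask token are frozen, forecasting is posed as reconstructing masked future positions appended to the history, and the only trainable parameters are a few prompt tokens added element-wise to the mask tokens.

```python
import torch
import torch.nn as nn


class PTTuningHead(nn.Module):
    """Sketch of prompt token tuning (PT-Tuning) for forecasting.

    Assumptions (hypothetical, not from the paper's code): the pre-trained
    encoder maps (batch, length, d_model) -> (batch, length, d_model), and a
    single mask token of size d_model was learned during pre-training.
    """

    def __init__(self, encoder: nn.Module, mask_token: torch.Tensor,
                 horizon_patches: int, d_model: int):
        super().__init__()
        self.encoder = encoder
        # Freeze all pre-trained parameters; only prompt tokens are trained.
        for p in self.encoder.parameters():
            p.requires_grad = False
        # The pre-trained mask token is retained but kept frozen (buffer).
        self.register_buffer("mask_token", mask_token.clone())
        # One trainable prompt token per future position, added element-wise
        # to the frozen mask token.
        self.prompt_tokens = nn.Parameter(torch.zeros(horizon_patches, d_model))

    def forward(self, history_emb: torch.Tensor) -> torch.Tensor:
        # history_emb: (batch, n_history_patches, d_model)
        b = history_emb.size(0)
        future_emb = self.mask_token + self.prompt_tokens          # (H, d_model)
        future_emb = future_emb.unsqueeze(0).expand(b, -1, -1)     # (B, H, d_model)
        # Forecasting as masked reconstruction: encode history together with
        # the masked (prompted) future positions, then read off the future part.
        full = torch.cat([history_emb, future_emb], dim=1)
        out = self.encoder(full)
        return out[:, history_emb.size(1):, :]


# Hypothetical usage: a small Transformer stands in for the pre-trained encoder.
d_model, n_hist, horizon = 64, 48, 12
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=2)
model = PTTuningHead(encoder, torch.zeros(d_model), horizon, d_model)
history = torch.randn(8, n_hist, d_model)   # already-embedded history patches
future_repr = model(history)                # (8, 12, 64) future representations
```

Because the prompt tokens are added element-wise rather than replacing the mask tokens, the number of trainable parameters stays tiny (horizon x d_model in this sketch) while the task objective remains the same masked-reconstruction objective used in pre-training.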
Pages: 147-162
Number of pages: 16