PT-Tuning: Bridging the Gap between Time Series Masked Reconstruction and Forecasting via Prompt Token Tuning

Cited by: 0
Authors
Liu, Hao [1 ]
Gan, Jinrui [1 ]
Fan, Xiaoxuan [1 ]
Zhang, Yi [1 ]
Luo, Chuanxian [2 ]
Zhang, Jing [2 ]
Jiang, Guangxin [3 ]
Qian, Yucheng [4 ]
Zhao, Changwei [4 ]
Ma, Huan [5 ]
Guo, Zhenyu [5 ]
Affiliations
[1] State Grid Smart Grid Res Inst Co Ltd, State Grid Lab Grid Adv Comp & Applicat, Beijing, Peoples R China
[2] State Grid Elect Power Res Inst Co Ltd, Wuhan Nan Ltd Liabil Co, Wuhan, Peoples R China
[3] State Grid Inner Mongolia East Power Co Ltd, Hohhot, Peoples R China
[4] State Grid Anhui Elect Power Co Ltd, Elect Power Res Inst, Beijing, Peoples R China
[5] State Grid Anhui Elect Power Co Ltd, Ultra High Voltage Co, Beijing, Peoples R China
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2024, PT 2 | 2025 / Vol. 14851
Keywords
Time series; Representation learning; Mask reconstruction; Forecasting; Fine-tuning;
DOI
10.1007/978-981-97-5779-4_10
Chinese Library Classification (CLC) Code
TP31 [Computer Software];
Subject Classification Codes
081202 ; 0835 ;
Abstract
Self-supervised learning has been actively studied in the time series domain, especially for masked reconstruction. Most of these methods follow the "Pre-training + Fine-tuning" paradigm, in which a new decoder replaces the pre-trained decoder to fit a specific downstream task, leading to an inconsistency between the upstream and downstream tasks. In this paper, we point out that unifying the task objectives and adapting to the task difficulty are critical for bridging the gap between masked reconstruction and forecasting. By retaining the pre-trained mask token during the fine-tuning stage, forecasting can be treated as a special case of masked reconstruction, where future values are masked and reconstructed from historical values. This guarantees the consistency of task objectives, but a gap in task difficulty remains, because masked reconstruction can exploit contextual information whereas forecasting can only use historical information for reconstruction. To further mitigate this gap, we propose a simple yet effective prompt token tuning (PT-Tuning) paradigm, in which all pre-trained parameters are frozen and only a few trainable prompt tokens are added to the extended mask tokens in an element-wise manner. Extensive experiments on real-world datasets demonstrate the superiority of the proposed paradigm, which achieves state-of-the-art performance compared to representation learning and end-to-end forecasting methods.
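The idea described in the abstract can be illustrated with a minimal sketch. The PyTorch code below is not the authors' implementation; the class name PTTuningHead, the stand-in Transformer encoder, and all dimensions are assumptions chosen only to show the core mechanism: the pre-trained encoder and mask token are frozen, forecasting is posed as reconstructing masked future positions appended to the history, and the only trainable parameters are a few prompt tokens added element-wise to the mask tokens.

```python
import torch
import torch.nn as nn


class PTTuningHead(nn.Module):
    """Sketch of prompt token tuning (PT-Tuning) for forecasting.

    Assumptions (hypothetical, not from the paper's code): the pre-trained
    encoder maps (batch, length, d_model) -> (batch, length, d_model), and a
    single mask token of size d_model was learned during pre-training.
    """

    def __init__(self, encoder: nn.Module, mask_token: torch.Tensor,
                 horizon_patches: int, d_model: int):
        super().__init__()
        self.encoder = encoder
        # Freeze all pre-trained parameters; only prompt tokens are trained.
        for p in self.encoder.parameters():
            p.requires_grad = False
        # The pre-trained mask token is retained but kept frozen (buffer).
        self.register_buffer("mask_token", mask_token.clone())
        # One trainable prompt token per future position, added element-wise
        # to the frozen mask token.
        self.prompt_tokens = nn.Parameter(torch.zeros(horizon_patches, d_model))

    def forward(self, history_emb: torch.Tensor) -> torch.Tensor:
        # history_emb: (batch, n_history_patches, d_model)
        b = history_emb.size(0)
        future_emb = self.mask_token + self.prompt_tokens          # (H, d_model)
        future_emb = future_emb.unsqueeze(0).expand(b, -1, -1)     # (B, H, d_model)
        # Forecasting as masked reconstruction: encode history together with
        # the masked (prompted) future positions, then read off the future part.
        full = torch.cat([history_emb, future_emb], dim=1)
        out = self.encoder(full)
        return out[:, history_emb.size(1):, :]


# Hypothetical usage: a small Transformer stands in for the pre-trained encoder.
d_model, n_hist, horizon = 64, 48, 12
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=2)
model = PTTuningHead(encoder, torch.zeros(d_model), horizon, d_model)
history = torch.randn(8, n_hist, d_model)   # already-embedded history patches
future_repr = model(history)                # (8, 12, 64) future representations
```

Because the prompt tokens are added element-wise rather than replacing the mask tokens, the number of trainable parameters stays tiny (horizon x d_model in this sketch) while the task objective remains the same masked-reconstruction objective used in pre-training.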
Pages: 147-162
Number of pages: 16