Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization

Cited by: 0
Authors
Xie, Sang Michael [1 ]
Ma, Tengyu [1 ]
Liang, Percy [1 ]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139 | 2021 / Vol. 139
Keywords
DOI
Not available
CLC classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We focus on prediction problems with structured outputs that are subject to output validity constraints, e.g. pseudocode-to-code translation where the code must compile. While labeled input-output pairs are expensive to obtain, "unlabeled" outputs, i.e. outputs without corresponding inputs, are freely available (e.g. code on GitHub) and provide information about output validity. Pre-training captures this structure by training a denoiser to denoise corrupted versions of unlabeled outputs. We first show that standard fine-tuning after pre-training destroys some of this structure. We then propose composed fine-tuning, which trains a predictor composed with the pre-trained denoiser. Importantly, the denoiser is fixed to preserve output structure. Like standard fine-tuning, the predictor is also initialized with the pre-trained denoiser. We prove for two-layer ReLU networks that composed fine-tuning significantly reduces the complexity of the predictor, thus improving generalization. Empirically, we show that composed fine-tuning improves over standard fine-tuning on two pseudocode-to-code translation datasets (3% and 6% relative). The improvement is magnified on out-of-distribution (OOD) examples (4% and 25% relative), suggesting that reducing predictor complexity improves OOD extrapolation.
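As a reading aid, here is a minimal sketch, in PyTorch-style Python, of the composition the abstract describes: a trainable base predictor, initialized from the pre-trained denoiser, is composed with a frozen copy of that denoiser, and only the predictor is updated on labeled pairs. This is not the authors' released code; the class names Denoiser and ComposedModel, the MLP architecture, the 64-dimensional toy vectors, and the dummy training step are illustrative assumptions, whereas the paper's experiments use sequence-to-sequence models for pseudocode-to-code translation.

    import copy
    import torch
    import torch.nn as nn

    class Denoiser(nn.Module):
        # Pre-trained on unlabeled outputs: maps corrupted outputs back to valid outputs.
        def __init__(self, dim=64):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

        def forward(self, y_noisy):
            return self.net(y_noisy)

    class ComposedModel(nn.Module):
        # Composed fine-tuning: f(x) = denoiser(predictor(x)), with the denoiser frozen.
        def __init__(self, pretrained_denoiser):
            super().__init__()
            # The base predictor is initialized from the pre-trained denoiser
            # (as in standard fine-tuning) and stays trainable.
            self.predictor = copy.deepcopy(pretrained_denoiser)
            # The denoiser itself is fixed so that the output structure learned
            # during pre-training is preserved.
            self.denoiser = pretrained_denoiser
            for p in self.denoiser.parameters():
                p.requires_grad = False

        def forward(self, x):
            return self.denoiser(self.predictor(x))

    denoiser = Denoiser()  # assume this was pre-trained by denoising unlabeled outputs
    model = ComposedModel(denoiser)
    optimizer = torch.optim.Adam(model.predictor.parameters(), lr=1e-3)

    x, y = torch.randn(8, 64), torch.randn(8, 64)  # dummy labeled batch
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

Freezing the denoiser does not block gradients: they still flow through it to the predictor during backpropagation, but the denoiser's parameters, which encode the output-validity structure, are never updated.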
Pages: 12
Related Papers
50 records in total
  • [1] Pruning Pre-trained Language Models Without Fine-Tuning
    Jiang, Ting
    Wang, Deqing
    Zhuang, Fuzhen
    Xie, Ruobing
    Xia, Feng
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 594 - 605
  • [2] Span Fine-tuning for Pre-trained Language Models
    Bao, Rongzhou
    Zhang, Zhuosheng
    Zhao, Hai
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1970 - 1979
  • [3] Overcoming Catastrophic Forgetting for Fine-Tuning Pre-trained GANs
    Zhang, Zeren
    Li, Xingjian
    Hong, Tao
    Wang, Tianyang
    Ma, Jinwen
    Xiong, Haoyi
    Xu, Cheng-Zhong
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT V, 2023, 14173 : 293 - 308
  • [4] Waste Classification by Fine-Tuning Pre-trained CNN and GAN
    Alsabei, Amani
    Alsayed, Ashwaq
    Alzahrani, Manar
    Al-Shareef, Sarah
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2021, 21 (08): 65 - 70
  • [5] Fine-Tuning Pre-Trained Language Models with Gaze Supervision
    Deng, Shuwen
    Prasse, Paul
    Reich, David R.
    Scheffer, Tobias
    Jäger, Lena A.
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 217 - 224
  • [6] Securely Fine-tuning Pre-trained Encoders Against Adversarial Examples
    Zhou, Ziqi
    Li, Minghui
    Liu, Wei
    Hu, Shengshan
    Zhang, Yechao
    Wang, Wei
    Xue, Lulu
    Zhang, Leo Yu
    Yao, Dezhong
    Jin, Hai
    45TH IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP 2024, 2024, : 3015 - 3033
  • [7] Variational Monte Carlo on a Budget - Fine-tuning Pre-trained Neural Wavefunctions
    Scherbela, Michael
    Gerard, Leon
    Grohs, Philipp
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [8] Fine-Tuning Pre-Trained CodeBERT for Code Search in Smart Contract
    Jin, Huan
    Li, Qinying
    Wuhan University Journal of Natural Sciences, 2023, 28 (03) : 237 - 245
  • [9] Fine-tuning Pre-trained Models for Robustness under Noisy Labels
    Ahn, Sumyeong
    Kim, Sihyeon
    Ko, Jongwoo
    Yun, Se-Young
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 3643 - 3651
  • [10] Debiasing Pre-Trained Language Models via Efficient Fine-Tuning
    Gira, Michael
    Zhang, Ruisu
    Lee, Kangwook
    PROCEEDINGS OF THE SECOND WORKSHOP ON LANGUAGE TECHNOLOGY FOR EQUALITY, DIVERSITY AND INCLUSION (LTEDI 2022), 2022, : 59 - 69