Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization

Cited by: 0
Authors
Xie, Sang Michael [1 ]
Ma, Tengyu [1 ]
Liang, Percy [1 ]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139 | 2021 / Vol. 139
Keywords
DOI
Not available
CLC classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We focus on prediction problems with structured outputs that are subject to output validity constraints, e.g. pseudocode-to-code translation where the code must compile. While labeled input-output pairs are expensive to obtain, "unlabeled" outputs, i.e. outputs without corresponding inputs, are freely available (e.g. code on GitHub) and provide information about output validity. Pre-training captures this structure by training a denoiser to denoise corrupted versions of unlabeled outputs. We first show that standard fine-tuning after pre-training destroys some of this structure. We then propose composed fine-tuning, which trains a predictor composed with the pre-trained denoiser. Importantly, the denoiser is fixed to preserve output structure. Like standard fine-tuning, the predictor is also initialized with the pre-trained denoiser. We prove for two-layer ReLU networks that composed fine-tuning significantly reduces the complexity of the predictor, thus improving generalization. Empirically, we show that composed fine-tuning improves over standard fine-tuning on two pseudocode-to-code translation datasets (3% and 6% relative). The improvement is magnified on out-of-distribution (OOD) examples (4% and 25% relative), suggesting that reducing predictor complexity improves OOD extrapolation.
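As a reading aid, here is a minimal sketch, in PyTorch-style Python, of the composition the abstract describes: a trainable base predictor, initialized from the pre-trained denoiser, is composed with a frozen copy of that denoiser, and only the predictor is updated on labeled pairs. This is not the authors' released code; the class names Denoiser and ComposedModel, the MLP architecture, the 64-dimensional toy vectors, and the dummy training step are illustrative assumptions, whereas the paper's experiments use sequence-to-sequence models for pseudocode-to-code translation.

    import copy
    import torch
    import torch.nn as nn

    class Denoiser(nn.Module):
        # Pre-trained on unlabeled outputs: maps corrupted outputs back to valid outputs.
        def __init__(self, dim=64):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

        def forward(self, y_noisy):
            return self.net(y_noisy)

    class ComposedModel(nn.Module):
        # Composed fine-tuning: f(x) = denoiser(predictor(x)), with the denoiser frozen.
        def __init__(self, pretrained_denoiser):
            super().__init__()
            # The base predictor is initialized from the pre-trained denoiser
            # (as in standard fine-tuning) and stays trainable.
            self.predictor = copy.deepcopy(pretrained_denoiser)
            # The denoiser itself is fixed so that the output structure learned
            # during pre-training is preserved.
            self.denoiser = pretrained_denoiser
            for p in self.denoiser.parameters():
                p.requires_grad = False

        def forward(self, x):
            return self.denoiser(self.predictor(x))

    denoiser = Denoiser()  # assume this was pre-trained by denoising unlabeled outputs
    model = ComposedModel(denoiser)
    optimizer = torch.optim.Adam(model.predictor.parameters(), lr=1e-3)

    x, y = torch.randn(8, 64), torch.randn(8, 64)  # dummy labeled batch
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

Freezing the denoiser does not block gradients: they still flow through it to the predictor during backpropagation, but the denoiser's parameters, which encode the output-validity structure, are never updated.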
Pages: 12
Related Papers
50 records in total
  • [1] Pruning Pre-trained Language Models Without Fine-Tuning
    Jiang, Ting
    Wang, Deqing
    Zhuang, Fuzhen
    Xie, Ruobing
    Xia, Feng
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 594 - 605
  • [2] Span Fine-tuning for Pre-trained Language Models
    Bao, Rongzhou
    Zhang, Zhuosheng
    Zhao, Hai
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1970 - 1979
  • [3] Overcoming Catastrophic Forgetting for Fine-Tuning Pre-trained GANs
    Zhang, Zeren
    Li, Xingjian
    Hong, Tao
    Wang, Tianyang
    Ma, Jinwen
    Xiong, Haoyi
    Xu, Cheng-Zhong
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT V, 2023, 14173 : 293 - 308
  • [4] Waste Classification by Fine-Tuning Pre-trained CNN and GAN
    Alsabei, Amani
    Alsayed, Ashwaq
    Alzahrani, Manar
    Al-Shareef, Sarah
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2021, 21 (08): 65 - 70
  • [5] Fine-Tuning Pre-Trained Language Models with Gaze Supervision
    Deng, Shuwen
    Prasse, Paul
    Reich, David R.
    Scheffer, Tobias
    Jäger, Lena A.
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 217 - 224
  • [6] Securely Fine-tuning Pre-trained Encoders Against Adversarial Examples
    Zhou, Ziqi
    Li, Minghui
    Liu, Wei
    Hu, Shengshan
    Zhang, Yechao
    Wang, Wei
    Xue, Lulu
    Zhang, Leo Yu
    Yao, Dezhong
    Jin, Hai
    45TH IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP 2024, 2024, : 3015 - 3033
  • [7] Variational Monte Carlo on a Budget - Fine-tuning Pre-trained Neural Wavefunctions
    Scherbela, Michael
    Gerard, Leon
    Grohs, Philipp
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [8] Fine-Tuning Pre-Trained CodeBERT for Code Search in Smart Contract
    Jin, Huan
    Li, Qinying
    Wuhan University Journal of Natural Sciences, 2023, 28 (03) : 237 - 245
  • [9] Fine-tuning Pre-trained Models for Robustness under Noisy Labels
    Ahn, Sumyeong
    Kim, Sihyeon
    Ko, Jongwoo
    Yun, Se-Young
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 3643 - 3651
  • [10] Debiasing Pre-Trained Language Models via Efficient Fine-Tuning
    Gira, Michael
    Zhang, Ruisu
    Lee, Kangwook
    PROCEEDINGS OF THE SECOND WORKSHOP ON LANGUAGE TECHNOLOGY FOR EQUALITY, DIVERSITY AND INCLUSION (LTEDI 2022), 2022, : 59 - 69