ZARA: Improving Few-Shot Self-Rationalization for Small Language Models

Cited by: 0
Authors
Chen, Wei-Lin [1 ]
Yen, An-Zi [2 ]
Wu, Cheng-Kuang [1 ]
Huang, Hen-Hsen [3 ]
Chen, Hsin-Hsi [1 ]
Affiliations
[1] Natl Taiwan Univ, Taipei, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Taipei, Taiwan
[3] Acad Sinica, Taipei, Taiwan
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Language models (LMs) that jointly generate end-task answers as well as free-text rationales are known as self-rationalization models. Recent works demonstrate substantial performance gains for self-rationalization by few-shot prompting LMs with rationale-augmented exemplars. However, the ability to benefit from explanations only emerges with large-scale LMs, which have poor accessibility. In this work, we explore the less-studied setting of leveraging explanations for small LMs to improve few-shot self-rationalization. We first revisit the relationship between rationales and answers. Inspired by the implicit mental process of how human beings assess explanations, we present a novel approach, Zero-shot Augmentation of Rationale-Answer pairs (ZARA), to automatically construct pseudo-parallel data for self-training by reducing the problem of plausibility judgement to natural language inference. Experimental results show ZARA achieves state-of-the-art (SOTA) performance on the FEB benchmark, for both task accuracy and the explanation metric. In addition, we conduct human and quantitative evaluations validating ZARA's ability to automatically identify plausible and accurate rationale-answer pairs.
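The core idea in the abstract — judging whether a rationale plausibly supports an answer by casting the check as natural language inference, then keeping confidently entailed rationale-answer pairs as pseudo-parallel self-training data — can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: `nli_entailment_prob` is a hypothetical stand-in (here a toy lexical-overlap heuristic) for a trained NLI model, and the hypothesis template is assumed.

```python
# Illustrative sketch of the NLI-based filtering idea described in the
# abstract (NOT the authors' code). The rationale serves as the NLI
# premise; a statement combining the question and candidate answer
# serves as the hypothesis. Pairs whose rationale confidently entails
# the answer are kept as pseudo-parallel data for self-training.

def nli_entailment_prob(premise: str, hypothesis: str) -> float:
    """Hypothetical NLI scorer. A real pipeline would query a trained
    NLI model; this toy heuristic uses lexical overlap for illustration."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / max(len(h), 1)

def select_pseudo_data(triples, threshold=0.5):
    """Keep (question, rationale, answer) triples whose rationale
    plausibly entails the stated answer, per the NLI score."""
    selected = []
    for question, rationale, answer in triples:
        hypothesis = f"{question} The answer is {answer}."
        if nli_entailment_prob(rationale, hypothesis) >= threshold:
            selected.append((question, rationale, answer))
    return selected
```

Under this sketch, a supportive rationale ("Ice is frozen water and frozen things are cold, so the answer is yes.") passes the filter for "Is ice cold?", while an unrelated rationale is discarded, mirroring the plausibility judgement the paper automates.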
Pages: 4682-4693 (12 pages)