FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

Citations: 0
Authors
Lakhotia, Kushal [1 ]
Paranjape, Bhargavi [2 ]
Ghoshal, Asish [1 ]
Yih, Wen-tau [1 ]
Mehdad, Yashar [1 ]
Iyer, Srinivasan [1 ]
Affiliations
[1] Facebook AI, New York, NY 10003 USA
[2] Univ Washington, Seattle, WA 98195 USA
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021) | 2021
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence-to-sequence (seq2seq) models have proven very effective at jointly making predictions and generating NL explanations. However, these models have several shortcomings: they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires large amounts of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts, and 3) intermediate fine-tuning on re-structured open-domain QA datasets to improve few-shot performance. FiD-Ex significantly improves over prior work on both explanation metrics and task accuracy across five tasks from the ERASER explainability benchmark, in fully supervised as well as few-shot settings.
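The abstract's first contribution, sentence markers, can be illustrated with a minimal sketch (the helper names and marker format below are hypothetical, not from the paper): each input sentence is prefixed with an index marker, the seq2seq model is trained to emit marker tokens instead of free-form text, and the markers are then mapped back to the original sentences, so the resulting explanation is extractive by construction and cannot be fabricated.

```python
import re

def add_sentence_markers(sentences):
    """Prefix each input sentence with an index marker such as [SENT3]."""
    return " ".join(f"[SENT{i}] {s}" for i, s in enumerate(sentences, 1))

def decode_markers(generated, sentences):
    """Map marker tokens in the model output back to the original sentences.

    Out-of-range indices are ignored, so the rationale can only ever
    contain verbatim input sentences.
    """
    ids = [int(m) for m in re.findall(r"\[SENT(\d+)\]", generated)]
    return [sentences[i - 1] for i in ids if 1 <= i <= len(sentences)]

sents = [
    "The sky appears blue due to Rayleigh scattering.",
    "Shorter wavelengths scatter more strongly.",
    "Sunsets look red for the same reason.",
]
marked_input = add_sentence_markers(sents)
# Suppose the trained seq2seq model generates the string "[SENT1] [SENT2]":
rationale = decode_markers("[SENT1] [SENT2]", sents)
```

Because decoding only ever copies whole input sentences, explanation quality can be scored directly against gold rationale spans, which is how the ERASER benchmark's extractive metrics are defined.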
Pages: 3712-3727 (16 pages)