FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

Cited by: 0
Authors
Lakhotia, Kushal [1 ]
Paranjape, Bhargavi [2 ]
Ghoshal, Asish [1 ]
Yih, Wen-tau [1 ]
Mehdad, Yashar [1 ]
Iyer, Srinivasan [1 ]
Affiliations
[1] Facebook AI, New York, NY 10003 USA
[2] Univ Washington, Seattle, WA 98195 USA
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021) | 2021
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence-to-sequence (seq2seq) models have proven to be very effective in jointly making predictions as well as generating NL explanations. However, these models have several shortcomings: they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires a large amount of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts, and 3) intermediate fine-tuning on re-structured open-domain QA datasets to improve few-shot performance. FiD-Ex significantly improves over prior work in terms of explanation metrics and task accuracy on five tasks from the ERASER explainability benchmark, in both fully supervised and few-shot settings.
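To make the sentence-marker idea from the abstract concrete, the Python sketch below shows how passage sentences could be prefixed with index tokens so that a seq2seq model can produce an extractive rationale simply by emitting those indices. The function names, marker format, and output format are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of the sentence-marker mechanism described in the abstract.
# Each sentence is prefixed with an index token ("sent1", "sent2", ...), so a
# generated rationale that consists of these tokens is extractive by
# construction. Names and formats here are assumptions for illustration.

from typing import List, Tuple


def add_sentence_markers(sentences: List[str]) -> str:
    """Prefix every sentence with a marker such as 'sent1', 'sent2', ..."""
    return " ".join(f"sent{i + 1} {s}" for i, s in enumerate(sentences))


def decode_rationale(generated: str, sentences: List[str]) -> Tuple[str, List[str]]:
    """Split the model output into the task prediction and the sentences
    referenced by the generated markers (the extractive rationale)."""
    # Assumed output format: "<answer> because: sent2 sent5"
    answer, _, marker_part = generated.partition("because:")
    rationale = [
        sentences[int(tok[len("sent"):]) - 1]
        for tok in marker_part.split()
        if tok.startswith("sent") and tok[len("sent"):].isdigit()
    ]
    return answer.strip(), rationale


# Usage example
sents = [
    "The Eiffel Tower is in Paris.",
    "It was completed in 1889.",
    "Paris is the capital of France.",
]
print(add_sentence_markers(sents))
print(decode_rationale("1889 because: sent2", sents))
```

Because the decoder can only point back to marked sentences, this formulation rules out free-form (potentially fabricated) explanation text, which is the motivation given in the abstract.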
Pages: 3712-3727
Page count: 16
Related Papers
50 records in total
  • [1] Neural AMR: Sequence-to-Sequence Models for Parsing and Generation
    Konstas, Ioannis
    Iyer, Srinivasan
    Yatskar, Mark
    Choi, Yejin
    Zettlemoyer, Luke
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 146 - 157
  • [2] Persian Keyphrase Generation Using Sequence-to-sequence Models
    Doostmohammadi, Ehsan
    Bokaei, Mohammad Hadi
    Sameti, Hossein
    2019 27TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE 2019), 2019, : 2010 - 2015
  • [3] Sparse Sequence-to-Sequence Models
    Peters, Ben
    Niculae, Vlad
    Martins, Andre F. T.
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 1504 - 1519
  • [4] Improving Sequence-to-Sequence Constituency Parsing
    Liu, Lemao
    Zhu, Muhua
    Shi, Shuming
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 4873 - 4880
  • [5] Data generation using sequence-to-sequence
    Joshi, Akshat
    Mehta, Kinal
    Gupta, Neha
    Valloli, Varun Kannadi
    2018 IEEE RECENT ADVANCES IN INTELLIGENT COMPUTATIONAL SYSTEMS (RAICS), 2018, : 108 - 112
  • [6] Assessing incrementality in sequence-to-sequence models
    Ulmer, Dennis
    Hupkes, Dieuwke
    Bruni, Elia
    4TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2019), 2019, : 209 - 217
  • [7] An Analysis of "Attention" in Sequence-to-Sequence Models
    Prabhavalkar, Rohit
    Sainath, Tara N.
    Li, Bo
    Rao, Kanishka
    Jaitly, Navdeep
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3702 - 3706
  • [8] Automatic Target Generation for Electronic Data Interchange using Sequence-to-Sequence Models
    Baysan, Mehmet Selman
    Kizilay, Furkan
    Gundogan, Haluk Harun
    Ozmen, Ayse Irem
    Ince, Gokhan
    INTELLIGENT AND FUZZY SYSTEMS, INFUS 2024 CONFERENCE, VOL 1, 2024, 1088 : 158 - 166
  • [9] Deep Reinforcement Learning for Sequence-to-Sequence Models
    Keneshloo, Yaser
    Shi, Tian
    Ramakrishnan, Naren
    Reddy, Chandan K.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (07) : 2469 - 2489
  • [10] Multilingual Sequence-to-Sequence Models for Hebrew NLP
    Eyal, Matan
    Noga, Hila
    Aharoni, Roee
    Szpektor, Idan
    Tsarfaty, Reut
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 7700 - 7708