FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

Cited by: 0
Authors
Lakhotia, Kushal [1 ]
Paranjape, Bhargavi [2 ]
Ghoshal, Asish [1 ]
Yih, Wen-tau [1 ]
Mehdad, Yashar [1 ]
Iyer, Srinivasan [1 ]
Affiliations
[1] Facebook AI, New York, NY 10003 USA
[2] Univ Washington, Seattle, WA 98195 USA
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021
Keywords
(none listed)
DOI
(not available)
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence-to-sequence (seq2seq) models have proven to be very effective in jointly making predictions and generating NL explanations. However, these models have several shortcomings: they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires a large amount of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts, and 3) intermediate fine-tuning on re-structured open-domain QA datasets to improve few-shot performance. FiD-Ex significantly improves over prior work in terms of explanation metrics and task accuracy on five tasks from the ERASER explainability benchmark, in both fully supervised and few-shot settings.
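The sentence-marker idea described in the abstract can be illustrated with a short sketch. This is not the authors' code; the marker format `[SENTk]` and the helper names are assumptions chosen for illustration. The point it demonstrates: if each context sentence is prefixed with an index marker and the model is trained to emit markers rather than free text, the rationale can only ever be recovered as verbatim source sentences, so fabricated explanations are ruled out by construction.

```python
import re

def add_sentence_markers(sentences):
    """Prefix each context sentence with an index marker, e.g. [SENT3]."""
    return " ".join(f"[SENT{i}] {s}" for i, s in enumerate(sentences, 1))

def markers_to_rationale(generated, sentences):
    """Map a generated marker sequence back to the original sentences.

    Out-of-range markers are ignored, so the rationale is always a
    subset of the input sentences (i.e. strictly extractive).
    """
    ids = [int(m) for m in re.findall(r"\[SENT(\d+)\]", generated)]
    return [sentences[i - 1] for i in ids if 1 <= i <= len(sentences)]

sents = ["Paris is in France.", "It rains often.", "France is in Europe."]
marked = add_sentence_markers(sents)
# marked == "[SENT1] Paris is in France. [SENT2] It rains often. [SENT3] France is in Europe."
rationale = markers_to_rationale("[SENT1] [SENT3]", sents)
# rationale == ["Paris is in France.", "France is in Europe."]
```

In the fusion-in-decoder setting, each long document would be split into chunks, each chunk marked and encoded independently, and the decoder would attend over all encoded chunks jointly; the marker mapping above stays the same per chunk.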
Pages: 3712-3727
Page count: 16
Related Papers
50 items in total
  • [11] Sequence-to-Sequence Models for Automated Text Simplification
    Botarleanu, Robert-Mihai
    Dascalu, Mihai
    Crossley, Scott Andrew
    McNamara, Danielle S.
    ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2020), PT II, 2020, 12164 : 31 - 36
  • [12] Sequence-to-Sequence Models for Emphasis Speech Translation
    Quoc Truong Do
    Sakti, Sakriani
    Nakamura, Satoshi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (10) : 1873 - 1883
  • [13] On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
    Michel, Paul
    Li, Xian
    Neubig, Graham
    Pino, Juan Miguel
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3103 - 3114
  • [14] A Comparison of Sequence-to-Sequence Models for Speech Recognition
    Prabhavalkar, Rohit
    Rao, Kanishka
    Sainath, Tara N.
    Li, Bo
    Johnson, Leif
    Jaitly, Navdeep
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 939 - 943
  • [15] Learning Damage Representations with Sequence-to-Sequence Models
    Yang, Qun
    Shen, Dejian
    SENSORS, 2022, 22 (02)
  • [16] On Sparsifying Encoder Outputs in Sequence-to-Sequence Models
    Zhang, Biao
    Titov, Ivan
    Sennrich, Rico
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 2888 - 2900
  • [17] Sequence-to-sequence Models for Cache Transition Systems
    Peng, Xiaochang
    Song, Linfeng
    Gildea, Daniel
    Satta, Giorgio
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 1842 - 1852
  • [18] Context Dependent Trajectory Generation using Sequence-to-Sequence Models for Robotic Toilet Cleaning
    Yang, Pin-Chu
    Koganti, Nishanth
    Ricardez, Gustavo Alfonso Garcia
    Yamamoto, Masaki
    Takamatsu, Jun
    Ogasawara, Tsukasa
    2020 29TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2020, : 932 - 937
  • [19] Automated Integration of Genomic Metadata with Sequence-to-Sequence Models
    Cannizzaro, Giuseppe
    Leone, Michele
    Bernasconi, Anna
    Canakoglu, Arif
    Carman, Mark J.
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: APPLIED DATA SCIENCE AND DEMO TRACK, ECML PKDD 2020, PT V, 2021, 12461 : 187 - 203
  • [20] A Fuzzy Training Framework for Controllable Sequence-to-Sequence Generation
    Li, Jiajia
    Wang, Ping
    Li, Zuchao
    Liu, Xi
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Hai
    Ai, Haojun
    IEEE ACCESS, 2022, 10 : 92467 - 92480