FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

Cited by: 0
Authors
Lakhotia, Kushal [1 ]
Paranjape, Bhargavi [2 ]
Ghoshal, Asish [1 ]
Yih, Wen-tau [1 ]
Mehdad, Yashar [1 ]
Iyer, Srinivasan [1 ]
Affiliations
[1] Facebook AI, New York, NY 10003 USA
[2] Univ Washington, Seattle, WA 98195 USA
Source
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021
Keywords
(none listed)
DOI
(not available)
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Natural language (NL) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black-box pre-trained models, for tasks such as Question Answering (QA) and Fact Verification. Recently, pre-trained sequence-to-sequence (seq2seq) models have proven to be very effective in jointly making predictions and generating NL explanations. However, these models have several shortcomings: they can fabricate explanations even for incorrect predictions, they are difficult to adapt to long input documents, and their training requires a large amount of labeled data. In this paper, we develop FiD-Ex, which addresses these shortcomings for seq2seq models by: 1) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation, 2) using the fusion-in-decoder architecture to handle long input contexts, and 3) intermediate fine-tuning on re-structured open-domain QA datasets to improve few-shot performance. FiD-Ex significantly improves over prior work in terms of explanation metrics and task accuracy on five tasks from the ERASER explainability benchmark, in both fully supervised and few-shot settings.
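The sentence-marker idea described in the abstract can be illustrated with a short sketch. This is not the authors' code; the marker format `[SENTk]` and the helper names are assumptions chosen for illustration. The point it demonstrates: if each context sentence is prefixed with an index marker and the model is trained to emit markers rather than free text, the rationale can only ever be recovered as verbatim source sentences, so fabricated explanations are ruled out by construction.

```python
import re

def add_sentence_markers(sentences):
    """Prefix each context sentence with an index marker, e.g. [SENT3]."""
    return " ".join(f"[SENT{i}] {s}" for i, s in enumerate(sentences, 1))

def markers_to_rationale(generated, sentences):
    """Map a generated marker sequence back to the original sentences.

    Out-of-range markers are ignored, so the rationale is always a
    subset of the input sentences (i.e. strictly extractive).
    """
    ids = [int(m) for m in re.findall(r"\[SENT(\d+)\]", generated)]
    return [sentences[i - 1] for i in ids if 1 <= i <= len(sentences)]

sents = ["Paris is in France.", "It rains often.", "France is in Europe."]
marked = add_sentence_markers(sents)
# marked == "[SENT1] Paris is in France. [SENT2] It rains often. [SENT3] France is in Europe."
rationale = markers_to_rationale("[SENT1] [SENT3]", sents)
# rationale == ["Paris is in France.", "France is in Europe."]
```

In the fusion-in-decoder setting, each long document would be split into chunks, each chunk marked and encoded independently, and the decoder would attend over all encoded chunks jointly; the marker mapping above stays the same per chunk.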
Pages: 3712-3727
Page count: 16
Related Papers
50 items in total
  • [11] Sequence-to-Sequence Models for Automated Text Simplification
    Botarleanu, Robert-Mihai
    Dascalu, Mihai
    Crossley, Scott Andrew
    McNamara, Danielle S.
    ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2020), PT II, 2020, 12164 : 31 - 36
  • [12] Sequence-to-Sequence Models for Emphasis Speech Translation
    Quoc Truong Do
    Sakti, Sakriani
    Nakamura, Satoshi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (10) : 1873 - 1883
  • [13] On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
    Michel, Paul
    Li, Xian
    Neubig, Graham
    Pino, Juan Miguel
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3103 - 3114
  • [14] A Comparison of Sequence-to-Sequence Models for Speech Recognition
    Prabhavalkar, Rohit
    Rao, Kanishka
    Sainath, Tara N.
    Li, Bo
    Johnson, Leif
    Jaitly, Navdeep
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 939 - 943
  • [15] Learning Damage Representations with Sequence-to-Sequence Models
    Yang, Qun
    Shen, Dejian
    SENSORS, 2022, 22 (02)
  • [16] On Sparsifying Encoder Outputs in Sequence-to-Sequence Models
    Zhang, Biao
    Titov, Ivan
    Sennrich, Rico
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 2888 - 2900
  • [17] Sequence-to-sequence Models for Cache Transition Systems
    Peng, Xiaochang
    Song, Linfeng
    Gildea, Daniel
    Satta, Giorgio
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 1842 - 1852
  • [18] Context Dependent Trajectory Generation using Sequence-to-Sequence Models for Robotic Toilet Cleaning
    Yang, Pin-Chu
    Koganti, Nishanth
    Ricardez, Gustavo Alfonso Garcia
    Yamamoto, Masaki
    Takamatsu, Jun
    Ogasawara, Tsukasa
    2020 29TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2020, : 932 - 937
  • [19] Automated Integration of Genomic Metadata with Sequence-to-Sequence Models
    Cannizzaro, Giuseppe
    Leone, Michele
    Bernasconi, Anna
    Canakoglu, Arif
    Carman, Mark J.
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: APPLIED DATA SCIENCE AND DEMO TRACK, ECML PKDD 2020, PT V, 2021, 12461 : 187 - 203
  • [20] A Fuzzy Training Framework for Controllable Sequence-to-Sequence Generation
    Li, Jiajia
    Wang, Ping
    Li, Zuchao
    Liu, Xi
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Hai
    Ai, Haojun
    IEEE ACCESS, 2022, 10 : 92467 - 92480