KILM: Knowledge Injection into Encoder-Decoder Language Models

Cited by: 0
Authors
Xu, Yan [1,2]
Namazifar, Mahdi [1]
Hazarika, Devamanyu [1]
Padmakumar, Aishwarya [1]
Liu, Yang [1]
Hakkani-Tur, Dilek [1]
Affiliations
[1] Amazon Alexa AI, Seattle, WA 98121 USA
[2] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1 | 2023
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Large pre-trained language models (PLMs) have been shown to retain implicit knowledge within their parameters. To enhance this implicit knowledge, we propose Knowledge Injection into Language Models (KILM), a novel approach that injects entity-related knowledge into encoder-decoder PLMs via a generative knowledge-infilling objective applied during continued pre-training. This is done without modifying the PLM architecture or adding parameters. Experimental results on a suite of knowledge-intensive tasks spanning numerous datasets show that KILM enables models to retain more knowledge and hallucinate less, while preserving their original performance on general NLU and NLG tasks. KILM also demonstrates improved zero-shot performance on tasks such as entity disambiguation, outperforming state-of-the-art models with 30x more parameters.
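The record gives no implementation details beyond the abstract, but a generative knowledge-infilling objective of the kind described can be sketched roughly as below, assuming a BART-style encoder-decoder accessed through Hugging Face transformers. The entity markers (<ent>, </ent>, <ent_desc>), the backbone checkpoint (facebook/bart-base), and the helper build_infilling_example are illustrative assumptions, not the paper's actual specification.

```python
# A minimal sketch of a generative knowledge-infilling objective of the kind
# the abstract describes: mark an entity mention in running text, mask out its
# description, and train the decoder to generate that description. The special
# tokens and input/target layout here are assumptions for illustration only.
from transformers import BartTokenizer, BartForConditionalGeneration

MODEL = "facebook/bart-base"  # assumed encoder-decoder backbone
tokenizer = BartTokenizer.from_pretrained(MODEL)
model = BartForConditionalGeneration.from_pretrained(MODEL)

# Hypothetical markers delimiting an entity mention and its masked description.
tokenizer.add_tokens(["<ent>", "</ent>", "<ent_desc>", "</ent_desc>"])
model.resize_token_embeddings(len(tokenizer))

def build_infilling_example(sentence: str, entity: str, description: str):
    """Insert entity markers plus a masked description slot; the target is
    the description itself, so the model learns to infill entity knowledge."""
    marked = sentence.replace(
        entity, f"<ent>{entity}</ent><ent_desc><mask></ent_desc>", 1
    )
    inputs = tokenizer(marked, return_tensors="pt", truncation=True)
    labels = tokenizer(
        f"<ent_desc>{description}</ent_desc>",
        return_tensors="pt", truncation=True,
    ).input_ids
    return inputs, labels

inputs, labels = build_infilling_example(
    "KILM was presented at the ACL conference.",
    "ACL",
    "the annual meeting of the Association for Computational Linguistics",
)
loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy
loss.backward()  # one continued-pre-training step on the infilling objective
```

In practice, such examples would presumably be built at scale from encyclopedic text (e.g., Wikipedia entity descriptions) during continued pre-training, consistent with the abstract's claim that the approach requires no architectural changes or extra parameters.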
Pages: 5013-5035
Page count: 23