Injecting Descriptive Meta-information into Pre-trained Language Models with Hypernetworks

Cited by: 3
Authors
Duan, Wenying [1 ]
He, Xiaoxi [2 ]
Zhou, Zimu [3 ]
Rao, Hong [1 ]
Thiele, Lothar [2 ]
Affiliations
[1] Nanchang Univ, Nanchang, Jiangxi, Peoples R China
[2] Swiss Fed Inst Technol, Zurich, Switzerland
[3] Singapore Management Univ, Singapore, Singapore
Source
INTERSPEECH 2021 | 2021
Funding
Swiss National Science Foundation
Keywords
descriptive meta-information; hypernetworks; pre-trained language model;
DOI
10.21437/Interspeech.2021-229
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject Classification Codes
100104; 100213
Abstract
Pre-trained language models have been widely adopted as backbones in various natural language processing tasks. However, existing pre-trained language models ignore descriptive meta-information in the text, such as the distinction between the title and the main body, leading to over-weighted attention on insignificant text. In this paper, we propose a hypernetwork-based architecture that models the descriptive meta-information and integrates it into pre-trained language models. Evaluations on three natural language processing tasks show that our method notably improves the performance of pre-trained language models and achieves state-of-the-art results on keyphrase extraction.
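The abstract does not specify the architecture, so the sketch below is a minimal, hypothetical illustration of the general idea only: a hypernetwork that maps a per-token meta-information label (e.g., title vs. main body) to modulation parameters applied to a pre-trained encoder's hidden states. All class names, shapes, and the choice of an elementwise affine transform are assumptions, not the paper's actual design.

```python
# Hypothetical sketch: injecting descriptive meta-information into a
# pre-trained encoder via a hypernetwork. Names and shapes are assumptions.
import torch
import torch.nn as nn


class MetaHyperNetwork(nn.Module):
    """Maps a meta-information type ID to per-token affine parameters."""

    def __init__(self, num_meta_types: int, hidden_size: int, meta_dim: int = 32):
        super().__init__()
        self.meta_embedding = nn.Embedding(num_meta_types, meta_dim)
        # Hypernetwork head: generates 2 * hidden_size parameters
        # (an elementwise scale and shift) from the meta embedding.
        self.param_generator = nn.Linear(meta_dim, 2 * hidden_size)

    def forward(self, meta_type_ids: torch.Tensor):
        # meta_type_ids: (batch, seq_len), e.g. 0 = title token, 1 = body token
        params = self.param_generator(self.meta_embedding(meta_type_ids))
        scale, shift = params.chunk(2, dim=-1)
        # Initialize near the identity so the pre-trained encoder's
        # behavior is preserved at the start of fine-tuning.
        return 1.0 + scale, shift


class MetaAwareEncoder(nn.Module):
    """Wraps a pre-trained encoder and modulates its hidden states."""

    def __init__(self, encoder: nn.Module, hidden_size: int, num_meta_types: int = 2):
        super().__init__()
        self.encoder = encoder
        self.hyper = MetaHyperNetwork(num_meta_types, hidden_size)

    def forward(self, input_ids, attention_mask, meta_type_ids):
        # Assumes a HuggingFace-style encoder returning last_hidden_state,
        # e.g. transformers.AutoModel.from_pretrained("bert-base-uncased").
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        scale, shift = self.hyper(meta_type_ids)
        return hidden * scale + shift  # meta-conditioned modulation
```

Under these assumptions, tokens from the title and the main body flow through the same encoder but receive different hypernetwork-generated transformations, which is one plausible way to realize the meta-information conditioning the abstract describes.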
Pages: 3216-3220
Number of pages: 5