Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Cited by: 6
Authors
Li, Wenbiao [1 ,2 ]
Sun, Rui [1 ,2 ]
Wu, Yunfang [1 ,3 ]
Affiliations
[1] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
Source
NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT I | 2022, Vol. 13551
Funding
National Natural Science Foundation of China
Keywords
Word semantics; Character representation; Pre-trained models;
DOI
10.1007/978-3-031-17120-8_1
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Most Chinese pre-trained models adopt characters as the basic units for downstream tasks. However, these models ignore the information carried by words and thus lose some important semantics. In this paper, we propose a new method to exploit word structure and integrate lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding onto its internal characters' embeddings according to a similarity weight. To strengthen the word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism to emphasize important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results given by different tokenizers. The experimental results show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. We conduct further analysis to demonstrate the effectiveness of each component of our model.
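The core step sketched in the abstract, projecting a word's embedding onto its internal characters by a similarity weight and then mixing the characters' representations, can be illustrated with a small PyTorch snippet. This is a minimal sketch of the general idea only, not the authors' implementation: the cosine-plus-softmax weighting, the 0.5 mixing coefficient, and the function name inject_word_semantics are assumptions for illustration.

```python
# Illustrative sketch (not the paper's released code) of enriching character
# embeddings with the semantics of the word that contains them.
import torch
import torch.nn.functional as F

def inject_word_semantics(char_embs: torch.Tensor, word_emb: torch.Tensor) -> torch.Tensor:
    """char_embs: (n_chars, d) embeddings of the characters inside one word.
    word_emb: (d,) embedding of the whole word from a word-level vocabulary.
    Returns character embeddings enriched with the word's semantics."""
    # Similarity weight between the word and each of its internal characters
    # (cosine + softmax is an assumed choice, not necessarily the paper's).
    sims = F.cosine_similarity(char_embs, word_emb.unsqueeze(0), dim=-1)   # (n_chars,)
    weights = F.softmax(sims, dim=-1)                                      # (n_chars,)
    # Project the word embedding onto each character in proportion to its weight.
    enriched = char_embs + weights.unsqueeze(-1) * word_emb.unsqueeze(0)   # (n_chars, d)
    # Mix the internal characters' representations to strengthen the word boundary
    # (equal 0.5/0.5 mixing is a placeholder coefficient).
    mixed = 0.5 * enriched + 0.5 * enriched.mean(dim=0, keepdim=True)      # (n_chars, d)
    return mixed

# Example: a 3-character word with 768-dimensional embeddings (BERT-base size).
chars = torch.randn(3, 768)
word = torch.randn(768)
print(inject_word_semantics(chars, word).shape)  # torch.Size([3, 768])
```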
Pages: 3-15 (13 pages)