Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Cited by: 6
Authors
Li, Wenbiao [1 ,2 ]
Sun, Rui [1 ,2 ]
Wu, Yunfang [1 ,3 ]
Affiliations
[1] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
Source
NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT I | 2022 / Vol. 13551
Funding
National Natural Science Foundation of China
Keywords
Word semantics; Character representation; Pre-trained models;
DOI
10.1007/978-3-031-17120-8_1
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Most Chinese pre-trained models adopt characters as the basic units for downstream tasks. However, these models ignore the information carried by words and thus lose some important semantics. In this paper, we propose a new method to exploit word structure and integrate lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding into its internal characters' embeddings according to a similarity weight. To strengthen the word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism to emphasize important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results given by different tokenizers. Experimental results show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. Further analysis demonstrates the effectiveness of each component of our model.
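To make the abstract's core operations concrete, below is a minimal sketch (not the authors' released code) of the similarity-weighted word-to-character projection and the intra-word mixing step. The dot-product similarity, softmax weighting, residual mixing via the intra-word mean, and all names and tensor shapes are illustrative assumptions; the alignment attention and tokenizer-ensemble steps are omitted.

```python
# A minimal sketch, assuming dot-product similarity and residual mixing.
# Not the authors' implementation; shapes and operations are illustrative.
import torch
import torch.nn.functional as F

def enrich_characters(char_emb: torch.Tensor, word_emb: torch.Tensor) -> torch.Tensor:
    """char_emb: (n_chars, d) embeddings of one word's internal characters.
    word_emb:  (d,) embedding of the containing word (e.g., from a word2vec table).
    Returns (n_chars, d) character representations enriched with word semantics."""
    # Similarity weight between the word and each of its internal characters.
    sim = char_emb @ word_emb                      # (n_chars,)
    weight = F.softmax(sim, dim=0)                 # (n_chars,)
    # Project the word's embedding onto each character by its similarity weight.
    projected = char_emb + weight.unsqueeze(-1) * word_emb   # (n_chars, d)
    # Mix the internal characters' representations within the word
    # (here: add the intra-word mean) to strengthen word boundary information.
    mixed = projected + projected.mean(dim=0, keepdim=True)
    return mixed

if __name__ == "__main__":
    d = 8
    chars = torch.randn(2, d)   # e.g., the two characters of one Chinese word
    word = torch.randn(d)
    print(enrich_characters(chars, word).shape)    # torch.Size([2, 8])
```

In the full model, such enriched character vectors would presumably be fused with the pre-trained model's hidden states, and the ensemble over different tokenizers would reconcile conflicting word boundaries before this projection is applied.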
Pages: 3-15
Page count: 13