Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Cited by: 6
Authors
Li, Wenbiao [1 ,2 ]
Sun, Rui [1 ,2 ]
Wu, Yunfang [1 ,3 ]
Affiliations
[1] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Word semantics; Character representation; Pre-trained models;
DOI
10.1007/978-3-031-17120-8_1
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Most Chinese pre-trained models adopt characters as the basic units for downstream tasks, but they ignore the information carried by words and thus lose important semantics. In this paper, we propose a new method to exploit word structure and integrate lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding onto its internal characters' embeddings according to a similarity weight. To strengthen word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism that emphasizes important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results of different tokenizers. Experimental results show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on several Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. Further analysis verifies the effectiveness of each component of our model.
Pages: 3-15
Number of pages: 13