Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Cited by: 6
Authors
Li, Wenbiao [1 ,2 ]
Sun, Rui [1 ,2 ]
Wu, Yunfang [1 ,3 ]
Affiliations
[1] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
Source
NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT I | 2022 / Vol. 13551
Funding
National Natural Science Foundation of China
Keywords
Word semantics; Character representation; Pre-trained models;
DOI
10.1007/978-3-031-17120-8_1
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Most Chinese pre-trained models adopt characters as the basic units for downstream tasks. However, these models ignore the information carried by words and thus lose some important semantics. In this paper, we propose a new method to exploit word structure and integrate lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding into its internal characters' embeddings according to a similarity weight. To strengthen the word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism to emphasize important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results given by different tokenizers. Experimental results show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. Further analysis demonstrates the effectiveness of each component of our model.
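To make the abstract's core operations concrete, below is a minimal sketch (not the authors' released code) of the similarity-weighted word-to-character projection and the intra-word mixing step. The dot-product similarity, softmax weighting, residual mixing via the intra-word mean, and all names and tensor shapes are illustrative assumptions; the alignment attention and tokenizer-ensemble steps are omitted.

```python
# A minimal sketch, assuming dot-product similarity and residual mixing.
# Not the authors' implementation; shapes and operations are illustrative.
import torch
import torch.nn.functional as F

def enrich_characters(char_emb: torch.Tensor, word_emb: torch.Tensor) -> torch.Tensor:
    """char_emb: (n_chars, d) embeddings of one word's internal characters.
    word_emb:  (d,) embedding of the containing word (e.g., from a word2vec table).
    Returns (n_chars, d) character representations enriched with word semantics."""
    # Similarity weight between the word and each of its internal characters.
    sim = char_emb @ word_emb                      # (n_chars,)
    weight = F.softmax(sim, dim=0)                 # (n_chars,)
    # Project the word's embedding onto each character by its similarity weight.
    projected = char_emb + weight.unsqueeze(-1) * word_emb   # (n_chars, d)
    # Mix the internal characters' representations within the word
    # (here: add the intra-word mean) to strengthen word boundary information.
    mixed = projected + projected.mean(dim=0, keepdim=True)
    return mixed

if __name__ == "__main__":
    d = 8
    chars = torch.randn(2, d)   # e.g., the two characters of one Chinese word
    word = torch.randn(d)
    print(enrich_characters(chars, word).shape)    # torch.Size([2, 8])
```

In the full model, such enriched character vectors would presumably be fused with the pre-trained model's hidden states, and the ensemble over different tokenizers would reconcile conflicting word boundaries before this projection is applied.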
Pages: 3-15
Page count: 13