Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Cited by: 6
Authors
Li, Wenbiao [1 ,2 ]
Sun, Rui [1 ,2 ]
Wu, Yunfang [1 ,3 ]
Affiliations
[1] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Word semantics; Character representation; Pre-trained models;
DOI
10.1007/978-3-031-17120-8_1
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Most Chinese pre-trained models adopt characters as the basic units for downstream tasks. However, these models ignore the information carried by words and thus lose some important semantics. In this paper, we propose a new method to exploit word structure and integrate lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding onto its internal characters' embeddings according to a similarity weight. To strengthen the word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism to emphasize important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results given by different tokenizers. Experimental results show that our approach achieves superior performance over the baseline pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. We conduct further analysis to demonstrate the effectiveness of each component of our model.
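The abstract compresses several concrete operations. As an illustration only, the following minimal PyTorch sketch shows how the three character-level steps could be realized: similarity-weighted projection of a word embedding into its internal characters, intra-word mixing to mark the word boundary, and an alignment attention that masks low-scoring characters. All function names, shapes, and the 0.5 mixing coefficient are assumptions made for this sketch; it is not the authors' implementation, and the tokenizer-ensemble step is omitted (it would sit upstream and supply the word spans).

```python
# Illustrative sketch only, not the authors' released code.
import torch
import torch.nn.functional as F


def fuse_word_into_chars(char_embs: torch.Tensor,
                         word_emb: torch.Tensor) -> torch.Tensor:
    """char_embs: (n_chars, d) embeddings of the characters inside one word.
    word_emb: (d,) embedding of the word itself (e.g. from a word2vec table).
    Returns enriched character embeddings of shape (n_chars, d)."""
    # Similarity weight: how strongly the word's semantics flow to each char.
    sim = F.cosine_similarity(char_embs, word_emb.unsqueeze(0), dim=-1)
    weight = torch.softmax(sim, dim=0).unsqueeze(-1)          # (n_chars, 1)
    # Project the word embedding onto each internal character.
    enriched = char_embs + weight * word_emb.unsqueeze(0)     # (n_chars, d)
    # Mix the internal characters to strengthen the boundary signal.
    # The 0.5/0.5 split is an assumed coefficient, not from the paper.
    mixed = enriched.mean(dim=0, keepdim=True)                # (1, d)
    return 0.5 * enriched + 0.5 * mixed


def alignment_attention(char_states: torch.Tensor,
                        keep_top_k: int | None = None) -> torch.Tensor:
    """Word-to-character alignment attention: score each character against
    a word-level summary and mask the lowest-scoring characters."""
    summary = char_states.mean(dim=0)                         # (d,)
    scores = char_states @ summary                            # (n_chars,)
    if keep_top_k is not None and keep_top_k < scores.numel():
        # Mask unimportant characters by sending their scores to -inf,
        # so they receive zero attention after the softmax.
        thresh = scores.topk(keep_top_k).values.min()
        scores = scores.masked_fill(scores < thresh, float("-inf"))
    attn = torch.softmax(scores, dim=0).unsqueeze(-1)         # (n_chars, 1)
    return (attn * char_states).sum(dim=0)                    # (d,)


if __name__ == "__main__":
    d = 8
    chars = torch.randn(2, d)   # e.g. the two characters of a Chinese word
    word = torch.randn(d)       # the word's own embedding
    enriched = fuse_word_into_chars(chars, word)
    pooled = alignment_attention(enriched, keep_top_k=1)
    print(enriched.shape, pooled.shape)  # (2, 8) and (8,)
```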
Pages: 3-15
Number of pages: 13