Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Cited: 6
Authors
Li, Wenbiao [1 ,2 ]
Sun, Rui [1 ,2 ]
Wu, Yunfang [1 ,3 ]
Affiliations
[1] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
Source
NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT I | 2022 / Vol. 13551
Funding
National Natural Science Foundation of China;
Keywords
Word semantics; Character representation; Pre-trained models;
DOI
10.1007/978-3-031-17120-8_1
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Most Chinese pre-trained models adopt characters as their basic units for downstream tasks. However, these models ignore the information carried by words and thus lose some important semantics. In this paper, we propose a new method to exploit word structure and integrate lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding onto its internal characters' embeddings according to a similarity weight. To strengthen the word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism to emphasize important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results of different tokenizers. The experimental results show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. We conduct further analysis to prove the effectiveness of each component of our model.
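To make the method concrete, below is a minimal PyTorch sketch (not the authors' released code) of the first two steps described in the abstract: projecting a word embedding onto its internal characters with a similarity weight, then mixing the characters' representations within the word to strengthen the boundary signal. The use of cosine similarity, the equal mixing ratio and the function name enrich_characters are illustrative assumptions, not details from the paper.

    # Minimal sketch of similarity-weighted word-to-character projection.
    # Assumptions (not from the paper): cosine similarity as the weight,
    # a simple mean as the mixing function, an equal mixing ratio.
    import torch
    import torch.nn.functional as F

    def enrich_characters(char_reprs: torch.Tensor, word_emb: torch.Tensor) -> torch.Tensor:
        """char_reprs: (n_chars, dim) hidden states of one word's internal characters.
        word_emb: (dim,) embedding of the whole word from a word-level lexicon."""
        # Similarity weight: how strongly the word's semantics flow to each character.
        sim = F.cosine_similarity(char_reprs, word_emb.unsqueeze(0), dim=-1)   # (n_chars,)
        weights = torch.softmax(sim, dim=0)                                    # (n_chars,)
        # Project the word embedding onto each character according to its weight.
        enriched = char_reprs + weights.unsqueeze(-1) * word_emb.unsqueeze(0)  # (n_chars, dim)
        # Mix the internal characters' representations to mark the word boundary.
        mixed = enriched.mean(dim=0, keepdim=True).expand_as(enriched)
        return 0.5 * enriched + 0.5 * mixed

    # Example: a two-character word with 768-dim representations.
    chars, word = torch.randn(2, 768), torch.randn(768)
    print(enrich_characters(chars, word).shape)  # torch.Size([2, 768])

The word-to-character alignment attention and the multi-tokenizer segmentation ensemble described in the abstract would then operate on the output of this step.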
Pages: 3-15 (13 pages)
Related Papers (50 total)
  • [41] Hari, Pragya; Singh, Maheshwari Prasad. Analyzing Fine-Tune Pre-trained Models for Detecting Cucumber Plant Growth. ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2022, PT II, 2023, 1798: 510-521.
  • [42] Maungmaung, Aprilpyone; Echizen, Isao; Kiya, Hitoshi. Efficient Key-Based Adversarial Defense for ImageNet by Using Pre-Trained Models. IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5: 902-913.
  • [43] Figueroa, Alejandro; Timilsina, Mohan. Textual Pre-Trained Models for Age Screening Across Community Question-Answering. IEEE ACCESS, 2024, 12: 30030-30038.
  • [44] Sun, Kexin; Shi, XiaoBo; Gao, Hui; Kuang, Hongyu; Ma, Xiaoxing; Rong, Guoping; Shao, Dong; Zhao, Zheng; Zhang, He. Incorporating Pre-trained Transformer Models into TextCNN for Sentiment Analysis on Software Engineering Texts. 13TH ASIA-PACIFIC SYMPOSIUM ON INTERNETWARE, INTERNETWARE 2022, 2022: 127-136.
  • [45] Li, Jiazhao; Murali, Adharsh; Mei, Qiaozhu; Vydiswaran, V. G. Vinod. Re-ranking Biomedical Literature for Precision Medicine with Pre-trained Neural Models. 2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020), 2020: 511-513.
  • [46] Fan, Guodong; Chen, Shizhan; Gao, Cuiyun; Xiao, Jianmao; Zhang, Tao; Feng, Zhiyong. RAPID: Zero-Shot Domain Adaptation for Code Search with Pre-Trained Models. ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (05).
  • [47] Kolides, Adam; Nawaz, Alyna; Rathor, Anshu; Beeman, Denzel; Hashmi, Muzammil; Fatima, Sana; Berdik, David; Al-Ayyoub, Mahmoud; Jararweh, Yaser. Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts. SIMULATION MODELLING PRACTICE AND THEORY, 2023, 126.
  • [48] Bu, Kun; Liu, Yuanchao; Ju, Xiaolong. Efficient utilization of pre-trained models: A review of sentiment analysis via prompt learning. KNOWLEDGE-BASED SYSTEMS, 2024, 283.
  • [49] Salim, Farsana; Saeed, Faisal; Basurra, Shadi; Qasem, Sultan Noman; Al-Hadhrami, Tawfik. DenseNet-201 and Xception Pre-Trained Deep Learning Models for Fruit Recognition. ELECTRONICS, 2023, 12 (14).
  • [50] Wang, Wenjie; Liu, Zheng; Feng, Fuli; Dou, Zhicheng; Ai, Qingyao; Yang, Grace Hui; Lian, Defu; Hou, Lu; Sun, Aixin; Zamani, Hamed; Metzler, Donald; de Rijke, Maarten. Pre-Trained Models for Search and Recommendation: Introduction to the Special Issue-Part 1. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2025, 43 (02).