Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Cited by: 6
Authors
Li, Wenbiao [1 ,2 ]
Sun, Rui [1 ,2 ]
Wu, Yunfang [1 ,3 ]
Affiliations
[1] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Word semantics; Character representation; Pre-trained models;
DOI
10.1007/978-3-031-17120-8_1
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Most Chinese pre-trained models adopt characters as the basic units for downstream tasks. However, these models ignore the information carried by words and thus lose some important semantics. In this paper, we propose a new method to exploit word structure and integrate lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding onto its internal characters' embeddings according to similarity weights. To strengthen word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism to emphasize important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results given by different tokenizers. Experimental results show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. Further analysis demonstrates the effectiveness of each component of our model.
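The abstract describes the projection-and-mixing step only at a high level. The following minimal PyTorch sketch shows one plausible reading of it: a word embedding is distributed over its internal characters by softmax similarity weights, and the characters within the word are then mixed to share boundary information. The function name `fuse_word_into_chars`, the dot-product similarity, and the equal-weight mixing are illustrative assumptions, not the authors' exact formulation (which additionally uses word-to-character alignment attention and a multi-tokenizer ensemble not sketched here).

```python
import torch
import torch.nn.functional as F

def fuse_word_into_chars(char_embs: torch.Tensor, word_emb: torch.Tensor) -> torch.Tensor:
    """Enrich the characters of one word with that word's embedding (illustrative sketch).

    char_embs: (num_chars, hidden) contextual embeddings of the word's characters.
    word_emb:  (hidden,)           embedding of the whole word.
    Returns:   (num_chars, hidden) enriched character representations.
    """
    # Similarity weight between the word and each internal character
    # (softmax over dot products; a hypothetical choice of similarity).
    sims = F.softmax(char_embs @ word_emb, dim=0)              # (num_chars,)

    # Project the word's semantics into each character, scaled by its weight.
    projected = char_embs + sims.unsqueeze(-1) * word_emb      # (num_chars, hidden)

    # Mix the internal characters of the word (simple averaging here) so that
    # characters belonging to the same word share boundary information.
    return 0.5 * projected + 0.5 * projected.mean(dim=0, keepdim=True)

# Toy usage: a two-character word with 8-dimensional embeddings.
chars = torch.randn(2, 8)
word = torch.randn(8)
print(fuse_word_into_chars(chars, word).shape)   # torch.Size([2, 8])
```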
Pages: 3-15
Number of pages: 13