Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Cited: 6
Authors
Li, Wenbiao [1 ,2 ]
Sun, Rui [1 ,2 ]
Wu, Yunfang [1 ,3 ]
Affiliations
[1] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Word semantics; Character representation; Pre-trained models;
DOI
10.1007/978-3-031-17120-8_1
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most Chinese pre-trained models adopt characters as their basic units for downstream tasks. However, these models ignore the information carried by words and thus lose important semantics. In this paper, we propose a new method to exploit word structure and integrate lexical semantics into the character representations of pre-trained models. Specifically, we project a word's embedding onto its internal characters' embeddings according to a similarity weight. To strengthen word boundary information, we mix the representations of the internal characters within a word. After that, we apply a word-to-character alignment attention mechanism to emphasize important characters by masking unimportant ones. Moreover, to reduce the error propagation caused by word segmentation, we present an ensemble approach that combines the segmentation results given by different tokenizers. Experimental results show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks: sentiment classification, sentence pair matching, natural language inference and machine reading comprehension. Further analysis demonstrates the effectiveness of each component of our model.
Pages: 3-15 (13 pages)
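The abstract describes a similarity-weighted projection of each word's embedding onto its internal characters, followed by an intra-word mixing step. Below is a minimal, illustrative sketch of that idea in PyTorch; the function name inject_word_semantics, the softmax dot-product similarity, the blending ratio, and the tensor shapes are assumptions for illustration only, not the paper's actual implementation. The word-to-character alignment attention and the multi-tokenizer segmentation ensemble are omitted for brevity.

# Minimal illustrative sketch (not the authors' code): distribute each word's
# embedding over its internal characters with similarity weights, then mix the
# characters within the word to strengthen word-boundary information.
import torch
import torch.nn.functional as F

def inject_word_semantics(char_hidden, word_emb, word_spans, mix=0.5):
    """char_hidden: (seq_len, d) character states from a pre-trained encoder.
    word_emb: (num_words, d) embeddings of the segmented words (assumed already
    projected into the same d-dimensional space).
    word_spans: list of (start, end) character index ranges, end exclusive.
    mix: assumed blending ratio for the intra-word mixing step."""
    enriched = char_hidden.clone()
    for w_idx, (start, end) in enumerate(word_spans):
        chars = char_hidden[start:end]              # (n_chars, d)
        w = word_emb[w_idx]                         # (d,)
        # Similarity weight: softmax over character-word dot products decides
        # how much of the word embedding each internal character receives.
        sim = F.softmax(chars @ w, dim=0)           # (n_chars,)
        projected = chars + sim.unsqueeze(-1) * w   # add word semantics per character
        # Intra-word mixing: blend each character with the word-internal mean.
        mixed = (1 - mix) * projected + mix * projected.mean(dim=0, keepdim=True)
        enriched[start:end] = mixed
    return enriched

# Toy usage: a 6-character sentence segmented into three words.
torch.manual_seed(0)
char_hidden = torch.randn(6, 8)
word_spans = [(0, 2), (2, 3), (3, 6)]
word_emb = torch.randn(len(word_spans), 8)
print(inject_word_semantics(char_hidden, word_emb, word_spans).shape)  # torch.Size([6, 8])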