Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models

Cited by: 0
Authors
Lai, Yuxuan [1 ,2 ]
Liu, Yijia [3 ]
Feng, Yansong [1 ,2 ]
Huang, Songfang [3 ]
Zhao, Dongyan [1 ,2 ]
Affiliations
[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China
[2] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[3] Alibaba Grp, Beijing, Peoples R China
Source
2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), 2021
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Chinese pre-trained language models usually process text as a sequence of characters, while ignoring coarser granularities, e.g., words. In this work, we propose a novel pre-training paradigm for Chinese, Lattice-BERT, which explicitly incorporates word representations along with characters, and thus can model a sentence in a multi-granularity manner. Specifically, we construct a lattice graph from the characters and words in a sentence and feed all of these text units into transformers. We design a lattice position attention mechanism to exploit the lattice structures in self-attention layers. We further propose a masked segment prediction task to push the model to learn from the rich but redundant information inherent in lattices, while avoiding learning unexpected tricks. Experiments on 11 Chinese natural language understanding tasks show that our model brings an average improvement of 1.5% under the 12-layer setting, achieving a new state of the art among base-size models on the CLUE benchmarks. Further analysis shows that Lattice-BERT can harness the lattice structures, and that the improvement comes from the exploitation of redundant information and multi-granularity representations.
Pages: 1716-1731
Page count: 16
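
As a rough illustration (not the authors' released code), the following Python sketch shows how a character-word lattice of the kind described in the abstract could be constructed: every character becomes a text unit, and every span that matches an entry in a lexicon is added as an additional unit with its (start, end) character offsets, which a lattice position attention mechanism could then consume. The LatticeUnit class, the build_lattice function, and the toy lexicon are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of building a character-word lattice for a Chinese sentence.
# This is an assumption-based illustration of the general idea, not the
# Lattice-BERT implementation.
from dataclasses import dataclass

@dataclass
class LatticeUnit:
    text: str   # the character or word
    start: int  # start character offset in the sentence
    end: int    # end character offset (exclusive)

def build_lattice(sentence, lexicon, max_word_len=4):
    """Enumerate every character plus every lexicon word matching a span."""
    # Character-level units: one per character.
    units = [LatticeUnit(ch, i, i + 1) for i, ch in enumerate(sentence)]
    # Word-level units: any span of length >= 2 found in the lexicon.
    for i in range(len(sentence)):
        for j in range(i + 2, min(i + max_word_len, len(sentence)) + 1):
            span = sentence[i:j]
            if span in lexicon:
                units.append(LatticeUnit(span, i, j))
    return units

if __name__ == "__main__":
    toy_lexicon = {"北京", "大学", "北京大学"}  # toy dictionary, for illustration only
    for unit in build_lattice("北京大学", toy_lexicon):
        print(unit.text, unit.start, unit.end)
    # In the paper's setting, all such units (characters and matched words)
    # are fed to the transformer together, with their span positions informing
    # the lattice position attention in the self-attention layers.
```

Running the example prints the four characters plus the matched words 北京, 大学, and 北京大学 with their spans, showing the redundant, overlapping units that the masked segment prediction task is designed to exploit.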