Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models

Cited by: 0
Authors
Lai, Yuxuan [1 ,2 ]
Liu, Yijia [3 ]
Feng, Yansong [1 ,2 ]
Huang, Songfang [3 ]
Zhao, Dongyan [1 ,2 ]
Affiliations
[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China
[2] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[3] Alibaba Grp, Beijing, Peoples R China
Source
2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), 2021
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Chinese pre-trained language models usually process text as a sequence of characters, while ignoring coarser granularities, e.g., words. In this work, we propose a novel pre-training paradigm for Chinese, Lattice-BERT, which explicitly incorporates word representations along with characters, and thus can model a sentence in a multi-granularity manner. Specifically, we construct a lattice graph from the characters and words in a sentence and feed all of these text units into transformers. We design a lattice position attention mechanism to exploit the lattice structures in self-attention layers. We further propose a masked segment prediction task to push the model to learn from the rich but redundant information inherent in lattices, while avoiding learning unexpected tricks. Experiments on 11 Chinese natural language understanding tasks show that our model brings an average improvement of 1.5% under the 12-layer setting, achieving a new state of the art among base-size models on the CLUE benchmarks. Further analysis shows that Lattice-BERT can harness the lattice structures, and that the improvement comes from the exploitation of redundant information and multi-granularity representations.
Pages: 1716-1731
Page count: 16
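
As a rough illustration (not the authors' released code), the following Python sketch shows how a character-word lattice of the kind described in the abstract could be constructed: every character becomes a text unit, and every span that matches an entry in a lexicon is added as an additional unit with its (start, end) character offsets, which a lattice position attention mechanism could then consume. The LatticeUnit class, the build_lattice function, and the toy lexicon are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of building a character-word lattice for a Chinese sentence.
# This is an assumption-based illustration of the general idea, not the
# Lattice-BERT implementation.
from dataclasses import dataclass

@dataclass
class LatticeUnit:
    text: str   # the character or word
    start: int  # start character offset in the sentence
    end: int    # end character offset (exclusive)

def build_lattice(sentence, lexicon, max_word_len=4):
    """Enumerate every character plus every lexicon word matching a span."""
    # Character-level units: one per character.
    units = [LatticeUnit(ch, i, i + 1) for i, ch in enumerate(sentence)]
    # Word-level units: any span of length >= 2 found in the lexicon.
    for i in range(len(sentence)):
        for j in range(i + 2, min(i + max_word_len, len(sentence)) + 1):
            span = sentence[i:j]
            if span in lexicon:
                units.append(LatticeUnit(span, i, j))
    return units

if __name__ == "__main__":
    toy_lexicon = {"北京", "大学", "北京大学"}  # toy dictionary, for illustration only
    for unit in build_lattice("北京大学", toy_lexicon):
        print(unit.text, unit.start, unit.end)
    # In the paper's setting, all such units (characters and matched words)
    # are fed to the transformer together, with their span positions informing
    # the lattice position attention in the self-attention layers.
```

Running the example prints the four characters plus the matched words 北京, 大学, and 北京大学 with their spans, showing the redundant, overlapping units that the masked segment prediction task is designed to exploit.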