What does the language system look like in pre-trained language models? A study using complex networks

Cited by: 1
Author
Zheng, Jianyu [1]
Affiliation
[1] Tsinghua Univ, Dept Chinese Language & Literature, Beijing 100084, Peoples R China
Keywords
Language model; BERT; Complex network; Language system;
DOI
10.1016/j.knosys.2024.111984
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Pre-trained language models (PLMs) have advanced the field of natural language processing (NLP). The exceptional capabilities that PLMs exhibit on NLP tasks have attracted researchers to explore the factors underlying their success. However, most existing work focuses on specific pieces of linguistic knowledge encoded in PLMs rather than on how these models comprehend language from a holistic perspective, and it cannot explain how PLMs organize the language system as a whole. We therefore adopt a complex network approach to represent the language system and investigate how language elements are organized within it. Specifically, we study the attention relationships among words generated by the attention heads of BERT models. Words are treated as nodes, and the connection between each word and the word it attends to most strongly is represented as an edge. After constructing these "words' attention networks", we analyze their properties from several perspectives by computing network metrics. The main findings are: (1) the English attention networks perform exceptionally well in organizing words; (2) most words' attention networks exhibit the small-world property and scale-free behavior; (3) some networks generated by multilingual BERT reflect typological information well, yielding good clustering of languages into language groups; and (4) in the cross-layer analysis, the networks from layers 8 to 10 of Chinese BERT and from layers 6 to 9 of English BERT exhibit more consistent characteristics. Our study provides a comprehensive account of how PLMs organize the language system, which can be used to evaluate and develop improved models.
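The construction described in the abstract can be illustrated with a short sketch. The code below is a minimal illustration, not the authors' released implementation: it assumes the Hugging Face transformers library, networkx, and the bert-base-uncased checkpoint; the chosen layer and head, the handling of special tokens, and the use of WordPiece tokens (rather than whole words) as nodes are simplifications made for illustration.

# Minimal sketch: build a "words' attention network" from one BERT attention
# head and compute the kinds of network metrics discussed in the abstract.
import networkx as nx
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "Pre-trained language models organize words into a structured system."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple of tensors, one per layer,
# each of shape (batch, heads, seq_len, seq_len).
layer, head = 6, 0                      # illustrative choice; the paper compares layers and heads
attn = outputs.attentions[layer][0, head]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

# Each token becomes a node; an edge links a token to the token it attends to
# most strongly. Special tokens and self-attention are skipped for simplicity.
G = nx.Graph()
special = {"[CLS]", "[SEP]"}
for i, tok in enumerate(tokens):
    if tok in special:
        continue
    j = int(torch.argmax(attn[i]))
    if j != i and tokens[j] not in special:
        G.add_edge(f"{tok}/{i}", f"{tokens[j]}/{j}")

# Metrics of the kind reported in the study: the clustering coefficient and
# average shortest path length (small-world indicators) and the degree
# distribution (heavy tails suggest scale-free behavior).
print("nodes:", G.number_of_nodes(), "edges:", G.number_of_edges())
print("average clustering:", nx.average_clustering(G))
if G.number_of_nodes() > 1 and nx.is_connected(G):
    print("average shortest path:", nx.average_shortest_path_length(G))
print("degree distribution:", sorted((d for _, d in G.degree()), reverse=True))

In practice such networks would be built over much longer texts and aggregated at the word level, and the metrics compared across layers, heads, and languages, as the abstract describes; a single sentence is used here only to keep the sketch self-contained.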
Pages: 11