What does the language system look like in pre-trained language models? A study using complex networks

Cited by: 1
Author
Zheng, Jianyu [1]
Affiliation
[1] Tsinghua Univ, Dept Chinese Language & Literature, Beijing 100084, Peoples R China
Keywords
Language model; BERT; Complex network; Language system;
DOI
10.1016/j.knosys.2024.111984
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Pre-trained language models (PLMs) have advanced the field of natural language processing (NLP). The exceptional capabilities that PLMs exhibit on NLP tasks have attracted researchers to explore the factors underlying their success. However, most existing work focuses on specific pieces of linguistic knowledge encoded in PLMs rather than on how these models comprehend language from a holistic perspective, and it cannot explain how PLMs organize the language system as a whole. We therefore adopt a complex network approach to represent the language system and investigate how language elements are organized within it. Specifically, we study the attention relationships among words generated by the attention heads of BERT models. Words are treated as nodes, and the connection between each word and the word it attends to most strongly is represented as an edge. After constructing these "words' attention networks", we analyze their properties from several perspectives by computing network metrics. The main findings are: (1) the English attention networks perform exceptionally well in organizing words; (2) most words' attention networks exhibit the small-world property and scale-free behavior; (3) some networks generated by multilingual BERT reflect typological information well, yielding good clustering of languages into language groups; and (4) in the cross-layer analysis, the networks from layers 8 to 10 of Chinese BERT and from layers 6 to 9 of English BERT exhibit more consistent characteristics. Our study provides a comprehensive account of how PLMs organize the language system, which can be used to evaluate and develop improved models.
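The construction described in the abstract can be illustrated with a short sketch. The code below is a minimal illustration, not the authors' released implementation: it assumes the Hugging Face transformers library, networkx, and the bert-base-uncased checkpoint; the chosen layer and head, the handling of special tokens, and the use of WordPiece tokens (rather than whole words) as nodes are simplifications made for illustration.

# Minimal sketch: build a "words' attention network" from one BERT attention
# head and compute the kinds of network metrics discussed in the abstract.
import networkx as nx
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "Pre-trained language models organize words into a structured system."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple of tensors, one per layer,
# each of shape (batch, heads, seq_len, seq_len).
layer, head = 6, 0                      # illustrative choice; the paper compares layers and heads
attn = outputs.attentions[layer][0, head]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

# Each token becomes a node; an edge links a token to the token it attends to
# most strongly. Special tokens and self-attention are skipped for simplicity.
G = nx.Graph()
special = {"[CLS]", "[SEP]"}
for i, tok in enumerate(tokens):
    if tok in special:
        continue
    j = int(torch.argmax(attn[i]))
    if j != i and tokens[j] not in special:
        G.add_edge(f"{tok}/{i}", f"{tokens[j]}/{j}")

# Metrics of the kind reported in the study: the clustering coefficient and
# average shortest path length (small-world indicators) and the degree
# distribution (heavy tails suggest scale-free behavior).
print("nodes:", G.number_of_nodes(), "edges:", G.number_of_edges())
print("average clustering:", nx.average_clustering(G))
if G.number_of_nodes() > 1 and nx.is_connected(G):
    print("average shortest path:", nx.average_shortest_path_length(G))
print("degree distribution:", sorted((d for _, d in G.degree()), reverse=True))

In practice such networks would be built over much longer texts and aggregated at the word level, and the metrics compared across layers, heads, and languages, as the abstract describes; a single sentence is used here only to keep the sketch self-contained.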
Pages: 11