Emergent linguistic structure in artificial neural networks trained by self-supervision

Cited by: 144
Authors
Manning, Christopher D. [1]
Clark, Kevin [1]
Hewitt, John [1]
Khandelwal, Urvashi [1]
Levy, Omer [2]
Affiliations
[1] Stanford University, Computer Science Department, Stanford, CA 94305, USA
[2] Facebook Inc., Facebook Artificial Intelligence Research, Seattle, WA 98109, USA
Keywords
artificial neural network; self-supervision; syntax; learning; LANGUAGE; ACQUISITION
DOI
10.1073/pnas.1907367117
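Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]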
Discipline classification codes
07; 0710; 09
Abstract
This paper explores the knowledge of linguistic structure learned by large artificial neural networks, trained via self-supervision, whereby the model simply tries to predict a masked word in a given context. Human language communication is via sequences of words, but language understanding requires constructing rich hierarchical structures that are never observed explicitly. The mechanisms for this have been a prime mystery of human language acquisition, while engineering work has mainly proceeded by supervised learning on treebanks of sentences hand labeled for this latent structure. However, we demonstrate that modern deep contextual language models learn major aspects of this structure, without any explicit supervision. We develop methods for identifying linguistic hierarchical structure emergent in artificial neural networks and demonstrate that components in these models focus on syntactic grammatical relationships and anaphoric coreference. Indeed, we show that a linear transformation of learned embeddings in these models captures parse tree distances to a surprising degree, allowing approximate reconstruction of the sentence tree structures normally assumed by linguists. These results help explain why these models have brought such large improvements across many language-understanding tasks.
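The abstract's claim that a linear transformation of learned embeddings captures parse tree distances can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from this record: the matrix `B`, the toy two-dimensional embeddings, the use of squared Euclidean distance after the transformation, and the use of a minimum spanning tree to turn pairwise distances into an undirected tree. The helper names `probe_distance` and `minimum_spanning_tree` are hypothetical.

```python
def probe_distance(B, h_i, h_j):
    """Squared L2 distance between two embeddings after applying the linear map B."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    projected = [sum(row[k] * diff[k] for k in range(len(diff))) for row in B]
    return sum(x * x for x in projected)


def minimum_spanning_tree(n, dist):
    """Prim's algorithm on a dense distance table; returns sorted undirected edges (i, j) with i < j."""
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge that connects the current tree to a new node.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((min(i, j), max(i, j)))
        in_tree.add(j)
    return sorted(edges)


# Toy data, made up for illustration: four "word" embeddings and an identity map.
# In practice B would be learned; this record does not say how.
B = [[1.0, 0.0], [0.0, 1.0]]
embeddings = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [1.0, 1.0]]

n = len(embeddings)
dist = [[probe_distance(B, embeddings[i], embeddings[j]) for j in range(n)] for i in range(n)]
tree_edges = minimum_spanning_tree(n, dist)
print(tree_edges)
```

In this toy setup, word 1 ends up connected to each of the other three words, so the recovered tree is a star centered on word 1. With real model embeddings, the recovered edges would be compared against a linguist-annotated parse tree.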
Pages: 30046-30054
Page count: 9