Event Knowledge in Large Language Models: The Gap Between the Impossible and the Unlikely

Cited by: 17
|
Authors
Kauf, Carina [1,2,8]
Ivanova, Anna A. [1,2,3]
Rambelli, Giulia [4]
Chersoni, Emmanuele [5]
She, Jingyuan Selena [1,2]
Chowdhury, Zawad [6]
Fedorenko, Evelina [1,2]
Lenci, Alessandro [7]
Affiliations
[1] MIT, Dept Brain & Cognit Sci, Cambridge, MA USA
[2] MIT, McGovern Inst Brain Res, Cambridge, MA USA
[3] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA USA
[4] Univ Bologna, Dept Modern Languages Literatures & Cultures, Bologna, Italy
[5] Hong Kong Polytech Univ, Dept Chinese & Bilingual Studies, Hong Kong, Peoples R China
[6] Univ Washington, Dept Math, Seattle, WA USA
[7] Univ Pisa, Dept Philol Literature & Linguist, Pisa, Italy
[8] MIT, Dept Brain & Cognit Sci, 43 Vassar St, Cambridge, MA 02139 USA
Keywords
Generalized event knowledge; World knowledge; Plausibility; Typicality; Artificial neural networks; Language models; Syntax; Semantics; Eye-movements; Prediction; Integration; Verbs; Representation; Perception; Violations; Memory
DOI
10.1111/cogs.13386
Chinese Library Classification
B84 [Psychology]
Discipline codes
04; 0402
Abstract
Word co-occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' semantic abilities is whether they acquire generalized knowledge of common events. Here, we test whether five pretrained LLMs (from 2018's BERT to 2023's MPT) assign a higher likelihood to plausible descriptions of agent-patient interactions than to minimally different implausible versions of the same event. Using three curated sets of minimal sentence pairs (total n = 1215), we found that pretrained LLMs possess substantial event knowledge, outperforming other distributional language models. In particular, they almost always assign a higher likelihood to possible versus impossible events (The teacher bought the laptop vs. The laptop bought the teacher). However, LLMs show less consistent preferences for likely versus unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface-level sentence features, (ii) LLM scores generalize well across syntactic variants (active vs. passive constructions) but less well across semantic variants (synonymous sentences), (iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence plausibility serves as an organizing dimension in internal LLM representations. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events.
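The study's core paradigm, as described above, scores each member of a minimal sentence pair with a language model and checks whether the model assigns the plausible version a higher likelihood. The sketch below illustrates that comparison logic only; the toy bigram probabilities are invented for this example and stand in for the pretrained-LLM sentence scores used in the actual paper.

```python
import math

# Toy bigram "language model". These log-probabilities are illustrative
# placeholders, NOT values from the paper: the study scored sentences with
# pretrained LLMs (BERT through MPT). Only the comparison logic is the point.
BIGRAM_LOGPROB = {
    ("the", "teacher"): math.log(0.20),
    ("teacher", "bought"): math.log(0.10),
    ("bought", "the"): math.log(0.30),
    ("the", "laptop"): math.log(0.15),
    ("laptop", "bought"): math.log(0.001),  # inanimate agent: implausible
}
UNSEEN = math.log(1e-4)  # crude back-off score for unseen bigrams


def sentence_logprob(sentence: str) -> float:
    """Sum bigram log-probabilities over the lowercased token sequence."""
    tokens = sentence.lower().rstrip(".").split()
    return sum(
        BIGRAM_LOGPROB.get(pair, UNSEEN) for pair in zip(tokens, tokens[1:])
    )


def prefers_plausible(plausible: str, implausible: str) -> bool:
    """A model 'passes' a minimal pair when it scores the plausible
    member strictly higher than the minimally different implausible one."""
    return sentence_logprob(plausible) > sentence_logprob(implausible)


if __name__ == "__main__":
    pair = ("The teacher bought the laptop.",
            "The laptop bought the teacher.")
    print(prefers_plausible(*pair))
```

With real LLMs, `sentence_logprob` would instead sum token log-probabilities from the model (pseudo-log-likelihood for masked models such as BERT, left-to-right log-likelihood for causal models), but the pairwise accuracy metric is computed the same way.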
Pages: 40