Surprisal From Language Models Can Predict ERPs in Processing Predicate-Argument Structures Only if Enriched by an Agent Preference Principle

Cited by: 9
Authors
Huber, Eva [1, 2]
Sauppe, Sebastian [1, 2, 3]
Isasi-Isasmendi, Arrate [1, 2]
Bornkessel-Schlesewsky, Ina [4]
Merlo, Paola [5, 6]
Bickel, Balthasar [1, 2]
Affiliations
[1] Univ Zurich, Dept Comparat Language Sci, Zurich, Switzerland
[2] Univ Zurich, Ctr Interdisciplinary Study Language Evolut, Zurich, Switzerland
[3] Univ Zurich, Dept Psychol, Zurich, Switzerland
[4] Univ South Australia, Australian Res Ctr Interact & Virtual Environm, Cognit Neurosci Lab, Adelaide, Australia
[5] Univ Geneva, Dept Linguist, Geneva, Switzerland
[6] Univ Geneva, Univ Ctr Comp Sci, Geneva, Switzerland
Source
NEUROBIOLOGY OF LANGUAGE | 2024 / Volume 5 / Issue 01
Funding
Swiss National Science Foundation; Australian Research Council
Keywords
artificial neural networks; computational modeling; event cognition; ERP; sentence processing; surprisal; large language models (LLMs); R PACKAGE; BRAIN; COMPREHENSION; EVENTS; ROLES; ORDER; INFORMATION; REANALYSIS; SPEAKERS; ANIMACY
DOI
10.1162/nol_a_00121
Chinese Library Classification
H0 [Linguistics]
Subject classification codes
030303; 0501; 050102
Abstract
Language models based on artificial neural networks increasingly capture key aspects of how humans process sentences. Most notably, model-based surprisals predict event-related potentials such as N400 amplitudes during parsing. Assuming that these models represent realistic estimates of human linguistic experience, their success in modeling language processing raises the possibility that the human processing system relies on no other principles than the general architecture of language models and on sufficient linguistic input. Here, we test this hypothesis on N400 effects observed during the processing of verb-final sentences in German, Basque, and Hindi. By stacking Bayesian generalised additive models, we show that, in each language, N400 amplitudes and topographies in the region of the verb are best predicted when model-based surprisals are complemented by an Agent Preference principle that transiently interprets initial role-ambiguous noun phrases as agents, leading to reanalysis when this interpretation fails. Our findings demonstrate the need for this principle independently of usage frequencies and structural differences between languages. The principle has an unequal force, however. Compared to surprisal, its effect is weakest in German, stronger in Hindi, and still stronger in Basque. This gradient is correlated with the extent to which grammars allow unmarked NPs to be patients, a structural feature that boosts reanalysis effects. We conclude that language models gain more neurobiological plausibility by incorporating an Agent Preference. Conversely, theories of human processing profit from incorporating surprisal estimates in addition to principles like the Agent Preference, which arguably have distinct evolutionary roots.
Pages: 167-200
Page count: 34