Grammatical illusions in BERT: Attraction effects of subject-verb agreement and reflexive-antecedent dependencies

Cited by: 1
Authors
Cho, Ye-eun [1]
Affiliation
[1] Sungkyunkwan Univ, Dept English Language & Literature, 25-2 Sungkyunkwan-Ro, Seoul 03063, South Korea
Funding
National Research Foundation of Singapore
Keywords
xAI; DNN language models; BERT; Masked Language Model; cue-combinatoric scheme; attraction effects; working memory; time course; interference; comprehension; retrieval; constraints; mechanisms; recovery
DOI
10.17250/khisli.40.2.202306.007
Chinese Library Classification
H [Language, Linguistics]
Discipline code
05
Abstract
Cho, Ye-eun. 2023. Grammatical illusions in BERT: Attraction effects of subject-verb agreement and reflexive-antecedent dependencies. Linguistic Research 40(2): 317-352. Attraction effects, whereby a verb erroneously retrieves a syntactically inaccessible but feature-matching noun, are a type of grammatical illusion (Phillips, Wagers, and Lau 2011) that can arise in long-distance subject-verb agreement in human sentence processing (Wagers et al. 2009). In contrast, reflexive-antecedent dependencies have been claimed to lack attraction effects when the reflexive and the antecedent mismatch (Dillon et al. 2013). Yet other studies have observed attraction effects in reflexive-antecedent dependencies when the number of feature mismatches between the reflexive and the antecedent increases (Parker and Phillips 2017). These findings suggest that cues are weighted differently depending on the predictability of the dependency, and that cues are combined according to different cue-combination schemes, such as a linear or a non-linear cue-combination rule (Parker 2019). These linguistic phenomena can be used to analyze how linguistic features are accessed and combined within the internal states of Deep Neural Network (DNN) language models. In the linguistic representations of BERT (Devlin et al. 2018), one of the pre-trained DNN language models, various types of linguistic information are encoded in each layer (Jawahar et al. 2019) and combined as they pass through the layers. By measuring the performance of the Masked Language Model (MLM), this study finds that both subject-verb agreement and reflexive-antecedent dependencies show attraction effects and follow a linear cue-combination rule in BERT. The divergence from human sentence processing suggests that BERT's self-attention mechanism may not capture differences in the predictability of a dependency as effectively as human memory retrieval mechanisms do. These findings have important implications for developing more understandable and interpretable explainable-AI (xAI) systems that better capture the complexities of human language processing. (Sungkyunkwan University)
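To make the abstract's core measure concrete, below is a minimal sketch of how MLM performance can be probed for agreement attraction. It is not the author's exact pipeline: the use of bert-base-uncased via the HuggingFace transformers library, the sentence frames, and the log-probability scoring are all illustrative assumptions. The probe compares how strongly BERT penalizes an ungrammatical plural verb when a singular versus a plural attractor intervenes between the subject and the masked verb position; a weaker penalty under the plural attractor would pattern with the attraction effects reported in the paper.

import torch
from transformers import BertForMaskedLM, BertTokenizer

# Illustrative setup, not the paper's exact configuration.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def verb_logprob(masked_sentence: str, verb: str) -> float:
    """Log-probability BERT assigns to `verb` at the [MASK] position."""
    inputs = tokenizer(masked_sentence, return_tensors="pt")
    # Locate the [MASK] token in the input sequence.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits
    log_probs = torch.log_softmax(logits[0, mask_pos], dim=-1)
    return log_probs[tokenizer.convert_tokens_to_ids(verb)].item()

# Hypothetical stimuli in the classic agreement-attraction frame: both are
# ungrammatical with a plural verb, differing only in the attractor's number.
sg_attractor = "The key to the cabinet [MASK] rusty."
pl_attractor = "The key to the cabinets [MASK] rusty."

# An attraction effect shows up as a smaller penalty for the illicit
# plural verb "were" when the intervening attractor is plural.
for frame in (sg_attractor, pl_attractor):
    print(frame, round(verb_logprob(frame, "were"), 3))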
Pages: 317-352
Page count: 36
References (61 in total)
[1] Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390-412.
[2] Bacon, G. (2019). arXiv preprint.
[3] Badecker, W., & Straub, K. (2002). The processing role of structural constraints on the interpretation of pronouns and anaphors. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28(4), 748-769.
[5] Belinkov, Y., & Glass, J. (2019). Analysis methods in neural language processing: A survey. Transactions of the Association for Computational Linguistics, 7, 49-72.
[6] Bock, K., & Miller, C. A. (1991). Broken agreement. Cognitive Psychology, 23(1), 45-93.
[7] Child, R., Gray, S., Radford, A., & Sutskever, I. (2019). Generating long sequences with sparse transformers. arXiv:1904.10509.
[8] Cho, W. I. (2021). In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, p. 2922.
[9] Chomsky, N. (1981). Lectures on Government and Binding. DOI: 10.1515/9783110884166.
[10] Correia, G. M., Niculae, V., & Martins, A. F. T. (2019). Adaptively sparse transformers. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), p. 2174.