Determining the Importance of Frequency and Contextual Diversity in the Lexical Organization of Multiword Expressions

Cited by: 10
Authors
Senaldi, Marco S. G. [1]
Titone, Debra A. [1]
Johns, Brendan T. [1]
Affiliations
[1] McGill Univ, Dept Psychol, 2001 McGill Coll Ave, Montreal, PQ H3A 1G1, Canada
Source
CANADIAN JOURNAL OF EXPERIMENTAL PSYCHOLOGY-REVUE CANADIENNE DE PSYCHOLOGIE EXPERIMENTALE | 2022 / Vol. 76 / Issue 2
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
lexical organization; semantic diversity; idioms; multiword expressions; distributional semantics; WORD-FREQUENCY; IDIOMATIC EXPRESSIONS; SEMANTIC DIVERSITY; LANGUAGE; COMPREHENSION; INFORMATION; FAMILIARITY; CHILDREN; ACCESS;
DOI
10.1037/cep0000271
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Corpus-based models of lexical strength have called into question the role of word frequency as an organizing principle of the lexicon, revealing that contextual and semantic diversity measures provide a closer fit to lexical behavior data (Adelman et al., 2006; Jones et al., 2012). Contextual diversity measures modify word frequency by ignoring word repetition in context, while semantic diversity measures consider the semantic consistency of contextual word occurrence. Recent research has shown that a better account of lexical organization data is provided by socially based measures of semantic diversity, which encode the communication patterns of individuals across discourses (Johns, 2021b). While most research on contextual diversity has focused on single words, recent corpus-based and experimental evidence suggests that an integral part of language use involves recurrent and more structurally complex units, such as multiword phrases and idioms. The aim of the present work was to determine whether contextual and semantic diversity drive lexical organization at the level of multiword units (here, operationalized as idiomatic expressions), in addition to single words. To this end, we analyzed normative ratings of familiarity for 210 English idioms (Libben & Titone, 2008) using a set of contextual, semantic, and socially based diversity measures that were computed from a 55-billion-word corpus of Reddit comments. The results confirm the superiority of diversity measures over frequency for multiword expressions, suggesting that multiword units, such as idiomatic phrases, show lexical organization dynamics similar to those of single words.
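The abstract's distinction between frequency and contextual diversity can be illustrated with a minimal sketch. It assumes that a "context" is a single document (for example, one comment) and that phrases are matched as exact whitespace-delimited token sequences. The function name `frequency_and_diversity`, the toy contexts, and the two idioms are hypothetical and chosen only for illustration; they are not taken from the study's materials or code. The semantic and socially based diversity measures are not covered here.

```python
from collections import Counter


def frequency_and_diversity(contexts, targets):
    """Count each target phrase two ways (illustrative sketch only).

    frequency: total occurrences across all contexts
    contextual diversity: number of contexts with at least one occurrence,
    so repetitions inside a single context are ignored
    """
    freq = Counter()
    diversity = Counter()
    for text in contexts:
        tokens = text.lower().split()
        for target in targets:
            phrase = target.lower().split()
            n = len(phrase)
            # Count every position where the phrase matches token by token.
            hits = sum(
                1
                for i in range(len(tokens) - n + 1)
                if tokens[i:i + n] == phrase
            )
            freq[target] += hits
            if hits:
                diversity[target] += 1
    return freq, diversity


# Hypothetical toy data: three contexts, two idioms.
toy_contexts = [
    "he had to bite the bullet and then bite the bullet again",
    "she chose to bite the bullet",
    "they decided to break the ice",
]
freq, diversity = frequency_and_diversity(
    toy_contexts, ["bite the bullet", "break the ice"]
)
```

In this toy data, "bite the bullet" occurs three times but in only two contexts, so its frequency count is 3 while its contextual diversity count is 2. The repetition inside the first context raises frequency but not diversity.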
Pages: 87-98
Page count: 12