Local similarity and global variability characterize the semantic space of human languages
Cited by: 5
Authors:
Lewis, Molly [1]
Cahill, Aoife [2]
Madnani, Nitin [3]
Evans, James [4,5]
Affiliations:
[1] Carnegie Mellon Univ, Psychol & Social & Decis Sci, Pittsburgh, PA 15213 USA
[2] Dataminr Inc, New York, NY 10016 USA
[3] Educ Testing Serv, Princeton, NJ 08541 USA
[4] Univ Chicago, Sociol & Data Sci, Chicago, IL 60637 USA
[5] Santa Fe Inst, Santa Fe, NM 87501 USA
How does meaning vary across the world's languages? Scholars recognize the existence of substantial variability within specific domains, ranging from nature and color to kinship. The emergence of large language models enables a systems-level approach that directly characterizes this variability through comparison of word organization across semantic domains. Here, we show that meanings across languages manifest lower variability within semantic domains and greater variability between them, using models trained on both 1) large corpora of native language text comprising Wikipedia articles in 35 languages and also 2) Test of English as a Foreign Language (TOEFL) essays written by 38,500 speakers from the same native languages, which cluster into semantic domains. Concrete meanings vary less across languages than abstract meanings, but all vary with geographical, environmental, and cultural distance. By simultaneously examining local similarity and global difference, we harmonize these findings and provide a description of general principles that govern variability in semantic space across languages. In this way, the structure of a speaker's semantic space influences the comparisons cognitively salient to them, as shaped by their native language, and suggests that even successful bilingual communicators likely think with "semantic accents" driven by associations from their native language while writing English. These findings have dramatic implications for language education, cross-cultural communication, and literal translations, which are impossible not because the objects of reference are uncertain, but because associations, metaphors, and narratives interlink meanings in different, predictable ways from one language to another.