The Flickr frequency norms: What 17 years of images tagged online tell us about lexical processing

Cited by: 4
Authors
Petilli, Marco A. [1]
Gunther, Fritz [2]
Marelli, Marco [1,3]
Affiliations
[1] Univ Milano Bicocca, Milan, Italy
[2] Univ Tubingen, Tubingen, Germany
[3] Milan Ctr Neurosci, NeuroMI, Milan, Italy
Keywords
Flickr frequency; Word frequency; Visual strength; Concreteness; Imageability; Flickr; WORD-FREQUENCY; DECISION DATA; LANGUAGE; CONCRETENESS; ACQUISITION; RATINGS; PERCEPTION; DATABASE
DOI
10.3758/s13428-022-02031-y
Chinese Library Classification number
B841 [Research methods in psychology]
Discipline classification code
040201
Abstract
Word frequency is one of the best predictors of language processing. Word frequency norms are usually based entirely on natural-language text data, and thus represent what the literature typically refers to as purely linguistic experience. This study presents Flickr frequency norms as a novel word frequency measure from a domain-specific corpus inherently tied to extra-linguistic information: words used as image tags on social media. To obtain Flickr frequency measures, we exploited the photo-sharing platform Flickr Image (containing billions of photos) and extracted the number of uploaded images tagged with each of the words considered in the lexicon. Here, we systematically examine the peculiarities of Flickr frequency norms and show that Flickr frequency is a hybrid metric, lying at the intersection between language and visual experience, with specific biases induced by its basis in image-focused social media. Moreover, regression analyses indicate that Flickr frequency captures additional information beyond what is already encoded in existing norms of linguistic, sensorimotor, and affective experience. These new norms therefore capture aspects of language usage that are missing from traditional frequency measures: a portion of language usage reflecting the interplay between language and vision, which, as this study demonstrates, has its own impact on word processing. The Flickr frequency norms are openly available on the Open Science Framework (https://osf.io/2zfs3/).
Pages: 126-147
Number of pages: 22