Adding part-of-speech information to the SUBTLEX-US word frequencies

Cited by: 162
Authors
Brysbaert, Marc [1]
New, Boris [2]
Keuleers, Emmanuel [1]
Affiliations
[1] Univ Ghent, Dept Expt Psychol, B-9000 Ghent, Belgium
[2] Univ Paris 05, Paris, France
Keywords
SUBTLEX; Word frequency; Part-of-speech information; Subtitles; Lexical decision; Nouns
DOI
10.3758/s13428-012-0190-4
Chinese Library Classification number
B841 [Research methods in psychology]
Discipline classification code
040201
Abstract
The SUBTLEX-US corpus has been parsed with the CLAWS tagger, so that researchers have information about the possible word classes (parts-of-speech, or PoSs) of the entries. Five new columns have been added to the SUBTLEX-US word frequency list: the dominant (most frequent) PoS for the entry, the frequency of the dominant PoS, the frequency of the dominant PoS relative to the entry's total frequency, all PoSs observed for the entry, and the respective frequencies of these PoSs. Because the current definition of lemma frequency does not seem to provide word recognition researchers with useful information (as illustrated by a comparison of the lemma frequencies and the word form frequencies from the Corpus of Contemporary American English), we have not provided a column with this variable. Instead, we hope that the full list of PoS frequencies will help researchers to collectively determine which combination of frequencies is the most informative.
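The five columns described in the abstract can be illustrated with a minimal sketch that derives them from per-tag counts for a single entry. The function name, the dictionary keys, the "." separator, and the example counts are all assumptions made for illustration; they are not the column names, file format, or data of the published SUBTLEX-US list.

```python
def pos_columns(tag_counts):
    """Derive five PoS summary values for one entry from its per-tag counts.

    tag_counts maps a PoS label to the number of times the entry received
    that tag. All output key names are hypothetical.
    """
    total = sum(tag_counts.values())
    # Rank tags by descending count; break ties alphabetically for determinism.
    ranked = sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    dom_pos, dom_freq = ranked[0]
    return {
        "dominant_pos": dom_pos,                         # most frequent PoS
        "dominant_pos_freq": dom_freq,                   # its frequency
        "dominant_pos_share": round(dom_freq / total, 2),  # relative to total
        "all_pos": ".".join(p for p, _ in ranked),       # every observed PoS
        "all_pos_freq": ".".join(str(c) for _, c in ranked),  # their counts
    }


# Made-up counts, for illustration only.
example = pos_columns({"Noun": 30, "Verb": 70})
```

Keeping the full list of tags and counts alongside the dominant PoS, rather than collapsing them into a single lemma frequency, matches the abstract's stated aim of letting researchers test different combinations of frequencies themselves.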
Pages: 991-997
Number of pages: 7