The Corpus of Contemporary American English as the first reliable monitor corpus of English

Cited by: 268
Author
Davies, Mark [1]
Affiliation
[1] Brigham Young Univ, Provo, UT 84602 USA
Source
LITERARY AND LINGUISTIC COMPUTING | 2010 / Vol. 25 / Issue 04
Keywords
VERBS;
DOI
10.1093/llc/fqq018
Chinese Library Classification number
H0 [Linguistics];
Discipline classification code
030303 ; 0501 ; 050102 ;
Abstract
The Corpus of Contemporary American English is the first large, genre-balanced corpus of any language that has been designed and constructed from the ground up as a 'monitor corpus' and that can be used to accurately track and study recent changes in the language. The 400-million-word corpus is evenly divided between spoken, fiction, popular magazines, newspapers, and academic journals. Most importantly, the genre balance stays almost exactly the same from year to year, which allows it to accurately model changes in the 'real world'. After discussing the corpus design, we provide a number of concrete examples of how the corpus can be used to look at recent changes in English, including morphology (new suffixes -friendly and -gate), syntax (including prescriptive rules, quotative like, so not ADJ, the get passive, resultatives, and verb complementation), semantics (such as changes in meaning with web, green, or gay), and lexis, including word and phrase frequency by year and the use of the corpus architecture to produce lists of all words that have had large shifts in frequency between specific historical periods.
Pages: 447-464
Page count: 18