Some Current Quantitative Problems in Corpus Linguistics and a Sketch of Some Solutions

Cited by: 22
Author
Gries, Stefan Th. [1]
Affiliation
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Keywords
association measures; mixed-effects; multi-level modeling; MuPDAR; token; type frequencies; variability-based neighbor clustering; CORPORA; MULTIFACTORIAL; LANGUAGE;
DOI
10.1177/1606822X14556606
Chinese Library Classification number
H0 [Linguistics];
Subject classification code
030303 ; 0501 ; 050102 ;
Abstract
This paper surveys a variety of methodological problems in current quantitative corpus linguistics. Some problems discussed are from corpus linguistics in general, such as the impact that dispersion, type frequencies/entropies, and directionality (should) have on the computation of association measures as well as the impact that neglecting the sampling structure of a corpus can have on a statistical analysis. Others involve more specialized areas in which corpus-linguistic work is currently booming, such as historical linguistics and learner corpus research. For each of the problems, first ideas/pointers as to how these problems can be resolved are provided and exemplified in some detail.
Pages: 93-117
Number of pages: 25