Evaluating the comparability of two measures of lexical diversity

Cited by: 14
Authors
deBoer, Fredrik [1]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
Keywords
Lexical diversity; vocd; HD-D; Lexicon; Computational linguistics; Applied linguistics; Textual processing;
DOI
10.1016/j.system.2014.10.008
Chinese Library Classification (CLC) number
G40 [Education];
Discipline classification codes
040101 ; 120403 ;
Abstract
Language practitioners and others increasingly rely on computerized assessments of large samples of written texts. To provide teachers and researchers with useful knowledge, new, more accurate metrics must be developed to aid in these assessments. One common aspect of such assessments is lexical diversity, or the range of vocabulary displayed in a text. The vocd program and the metric it produces, VOCD-D, have become popular options for researchers attempting to assess lexical diversity. However, researchers have argued that this metric is in fact a complex approximation of a more direct and less variable measure derived from probability sampling, known as HD-D. Using a data set of essays written by Chinese, Japanese, Korean, and native English speakers drawn from the International Corpus Network of Asian Learners of English, this research investigates that approximation by comparing correlations across L1 and L2 writers. In all cases, the correlations between HD-D and VOCD-D are very high, suggesting that the similarity between these metrics is indeed a product of their statistical mechanisms. © 2014 Elsevier Ltd. All rights reserved.
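As an illustration of what "derived from probability sampling" can mean in practice, the following is a minimal sketch of an HD-D style calculation. It assumes the commonly described formulation: for each word type, take the hypergeometric probability that the type appears at least once in a random sample of 42 tokens drawn without replacement, scale it by 1/42, and sum over all types. The function name `hdd`, the default sample size of 42, and the 1/42 scaling are assumptions for this sketch and are not details given in the abstract.

```python
from collections import Counter
from math import comb


def hdd(tokens, sample_size=42):
    """Sketch of an HD-D style score (assumed formulation, see lead-in).

    Returns the expected type-token ratio of a random sample of
    `sample_size` tokens drawn from `tokens` without replacement.
    """
    n = len(tokens)
    if n < sample_size:
        raise ValueError("text is shorter than the sample size")

    # Number of distinct samples of the requested size.
    total_draws = comb(n, sample_size)

    score = 0.0
    for count in Counter(tokens).values():
        # Hypergeometric probability that this type is absent from the
        # sample; comb() returns 0 when the sample cannot avoid the type.
        p_absent = comb(n - count, sample_size) / total_draws
        # Contribution of this type to the expected type-token ratio.
        score += (1.0 - p_absent) / sample_size
    return score
```

Under this formulation, a text in which every token is a different word scores close to 1, and a text that repeats a single word scores 1/42; tokenization and lemmatization choices are left to the caller.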
Pages: 139-145
Number of pages: 7