Learning Diphone-Based Segmentation

Cited by: 47
Authors
Daland, Robert [1]
Pierrehumbert, Janet B. [2]
Affiliations
[1] Univ Calif Los Angeles, Dept Linguist, Los Angeles, CA 90095 USA
[2] Northwestern Univ, Dept Linguist, Evanston, IL 60208 USA
Keywords
Language acquisition; Word segmentation; Bayesian; Unsupervised learning; Computational model; INFANTS SENSITIVITY; WORD BOUNDARIES; PERCEPTUAL REORGANIZATION; PHONOTACTIC PROBABILITY; LINGUISTIC EXPERIENCE; PHONETIC PERCEPTION; SPEECH CONTRASTS; MODEL; CUES; DISCRIMINATION
DOI
10.1111/j.1551-6709.2010.01160.x
Chinese Library Classification (CLC) number
B84 [Psychology];
Discipline classification code
04; 0402;
Abstract
This paper reconsiders the diphone-based word segmentation model of Cairns, Shillcock, Chater, and Levy (1997) and Hockema (2006), previously thought to be unlearnable. A statistically principled learning model is developed using Bayes' theorem and reasonable assumptions about infants' implicit knowledge. The ability to recover phrase-medial word boundaries is tested using phonetic corpora derived from spontaneous interactions with children and adults. The (unsupervised and semi-supervised) learning models are shown to exhibit several crucial properties. First, only a small amount of language exposure is required to achieve the model's ceiling performance, equivalent to between 1 day and 1 month of caregiver input. Second, the models are robust to variation, both in the free parameter and the input representation. Finally, both the learning and baseline models exhibit undersegmentation, argued to have significant ramifications for speech processing as a whole.
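To make the diphone idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a word-segmented toy corpus and estimates, by direct counting, the probability that a word boundary falls inside each diphone (pair of adjacent phones), then posits a boundary wherever that probability exceeds a threshold. By Bayes' theorem the same quantity can be written p(# | xy) = p(xy | #) p(#) / p(xy), where # denotes a word boundary. The paper's learning models have to estimate this without seeing word boundaries, which this sketch does not attempt. The function names, the one-character-per-phone encoding, the toy corpus, and the 0.5 threshold are all illustrative assumptions.

```python
from collections import Counter


def train_baseline(utterances):
    """Estimate p(boundary | diphone) by counting in a segmented corpus.

    `utterances` is a list of utterances, each a list of non-empty words,
    each word a string with one character per phone (a hypothetical toy
    encoding, not the corpus format used in the paper).
    """
    total = Counter()     # all occurrences of each diphone
    spanning = Counter()  # occurrences with a word boundary inside the diphone
    for words in utterances:
        phones = "".join(words)
        # Collect positions i such that a boundary lies between
        # phones[i - 1] and phones[i].
        boundaries = set()
        pos = 0
        for word in words[:-1]:
            pos += len(word)
            boundaries.add(pos)
        for i in range(1, len(phones)):
            diphone = phones[i - 1:i + 1]
            total[diphone] += 1
            if i in boundaries:
                spanning[diphone] += 1
    return {d: spanning[d] / total[d] for d in total}


def segment(phones, p_boundary, threshold=0.5):
    """Insert a space inside every diphone whose boundary probability
    exceeds `threshold`. Unseen diphones are treated as probability 0,
    which is an assumption of this sketch."""
    if not phones:
        return ""
    out = [phones[0]]
    for i in range(1, len(phones)):
        if p_boundary.get(phones[i - 1:i + 1], 0.0) > threshold:
            out.append(" ")
        out.append(phones[i])
    return "".join(out)


# Hypothetical toy corpus, using spelling as a stand-in for phones.
corpus = [["the", "dog"], ["the", "cat"], ["a", "dog"]]
model = train_baseline(corpus)
print(segment("thecat", model))
```

Because each decision depends only on the two phones around a candidate position, the segmenter needs no lexicon. In this sketch any diphone that was never observed spanning a boundary yields no boundary, so the output errs toward leaving words joined.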
Pages: 119-155
Number of pages: 37