Learning phonetic categories by tracking movements

Cited: 34
Authors
Gauthier, Bruno
Shi, Rushen
Xu, Yi
Affiliations
[1] Univ Quebec, Dept Psychol, Montreal, PQ H3C 3P8, Canada
[2] UCL, Dept Phonet & Linguist, London WC1E 6BT, England
Keywords
category formation; infant speech perception; language acquisition; unsupervised learning; self-organizing maps; target approximation; lexical tone; contextual tonal variation; theories of speech production and perception; LANGUAGE SPEECH-PERCEPTION; STOP-CONSONANT PERCEPTION; TONE PERCEPTION; MANDARIN CHINESE; MAXIMUM SPEED; LEXICAL TONE; PITCH; COARTICULATION; INFORMATION; REALIZATION;
DOI
10.1016/j.cognition.2006.03.002
CLC Classification Number
B84 [Psychology];
Discipline Code
04; 0402;
Abstract
We explore in this study how infants may derive phonetic categories from adult input that is highly variable. Neural networks in the form of self-organizing maps (SOMs; Kohonen, 1989, 1995) were used to simulate unsupervised learning of Mandarin tones. In Simulation 1, we trained the SOMs with syllable-sized continuous F0 contours, produced by multiple speakers in connected speech, and with the corresponding velocity profiles (D1). No attempt was made to reduce the large amount of variability in the input or to add to the input any abstract features such as the height and slope of the F0 contours. In the testing phase, a reasonably high categorization rate was achieved with F0 profiles, whereas D1 profiles yielded almost perfect categorization of the four tones. Close inspection of the learned prototypical D1 profile clusters revealed that they had effectively eliminated surface variability and directly reflected articulatory movements toward the underlying targets of the four tones as proposed by Xu and Wang (2001). Additional simulations indicated that a further learning step was possible, through which D1 prototypes with one-to-one correspondence to the tones were derived from the prototype clusters learned in Simulation 1. Implications of these findings for theories of language acquisition, speech perception, and speech production are discussed. (c) 2006 Elsevier B.V. All rights reserved.
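The abstract describes the modeling setup only at a high level. As a hedged illustration of the general idea (an SOM trained on raw F0 contours versus their first-derivative, or D1, profiles), the sketch below uses synthetic contours and a minimal hand-rolled map; the tone shapes, map size, training schedule, and purity measure are all assumptions for illustration, not the authors' stimuli, parameters, or evaluation.

```python
# Illustrative sketch only: a minimal self-organizing map (SOM) trained on
# synthetic Mandarin-like F0 contours and their velocity (D1) profiles.
# Contour shapes, map size, and learning schedule are assumptions, not the
# authors' actual data or implementation.
import numpy as np

rng = np.random.default_rng(0)
T = 30  # samples per syllable-sized contour

def synthetic_contour(tone):
    """Rough stand-ins for the four Mandarin tone shapes, with speaker and
    context variability added as a random offset, scaling, and noise."""
    t = np.linspace(0.0, 1.0, T)
    shapes = {
        1: np.zeros_like(t),                 # high level
        2: t - 0.5,                          # rising
        3: 2.0 * (t - 0.5) ** 2 - 0.5,       # dipping (falling-rising)
        4: 0.5 - t,                          # falling
    }
    f0 = shapes[tone]
    f0 = f0 * rng.uniform(0.7, 1.3) + rng.uniform(-0.3, 0.3)  # speaker variation
    return f0 + rng.normal(0.0, 0.02, T)                      # measurement noise

def d1(f0):
    """First-derivative (velocity) profile of an F0 contour."""
    return np.gradient(f0)

class SOM:
    """Tiny 1-D Kohonen-style self-organizing map for fixed-length vectors."""
    def __init__(self, n_units, dim):
        self.w = rng.normal(0.0, 0.1, (n_units, dim))

    def train(self, data, epochs=20, lr0=0.5, sigma0=2.0):
        n = len(self.w)
        for e in range(epochs):
            lr = lr0 * (1.0 - e / epochs)
            sigma = max(sigma0 * (1.0 - e / epochs), 0.5)
            for x in rng.permutation(data):
                bmu = np.argmin(np.linalg.norm(self.w - x, axis=1))
                dist = np.abs(np.arange(n) - bmu)
                h = np.exp(-(dist ** 2) / (2 * sigma ** 2))  # neighborhood kernel
                self.w += lr * h[:, None] * (x - self.w)

    def winner(self, x):
        return int(np.argmin(np.linalg.norm(self.w - x, axis=1)))

# Train one map on raw F0 contours and one on D1 profiles, then compare how
# consistently tokens of the same tone land on the same map units.
tones = rng.integers(1, 5, size=400)
f0_data = np.array([synthetic_contour(t) for t in tones])
d1_data = np.array([d1(x) for x in f0_data])

for name, data in [("F0", f0_data), ("D1", d1_data)]:
    som = SOM(n_units=8, dim=T)
    som.train(data)
    winners = np.array([som.winner(x) for x in data])
    # Purity: fraction of tokens whose tone matches the majority tone of their unit.
    purity = np.mean([
        np.mean(tones[winners == u] == np.bincount(tones[winners == u]).argmax())
        for u in np.unique(winners)
    ])
    print(f"{name} map: mean cluster purity = {purity:.2f}")
```

Because the random pitch offset simulating speaker differences is removed by differentiation, the D1-trained map typically forms purer tone clusters than the F0-trained map in this toy setting, which is the qualitative pattern the abstract reports for the actual simulations.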
Pages: 80-106
Page count: 27
Related Papers
50 records in total
[41]   Lexically-guided perceptual learning does generalize to new phonetic contexts [J].
Nelson, Scott ;
Durvasula, Karthik .
JOURNAL OF PHONETICS, 2021, 84
[42]   The use of eye movements in the study of multimedia learning [J].
Hyona, Jukka .
LEARNING AND INSTRUCTION, 2010, 20 (02) :172-176
[43]   Tracking hand movements captures the response dynamics of the evaluative priming effect [J].
Kawakami, Naoaki ;
Miura, Emi .
COGNITION & EMOTION, 2019, 33 (03) :452-465
[44]   Role of motor execution in the ocular tracking of self-generated movements [J].
Chen, Jing ;
Valsecchi, Matteo ;
Gegenfurtner, Karl R. .
JOURNAL OF NEUROPHYSIOLOGY, 2016, 116 (06) :2586-2593
[45]   What are the letters of speech? Testing the role of phonological specification and phonetic similarity in perceptual learning [J].
Mitterer, Holger ;
Cho, Taehong ;
Kim, Sahyang .
JOURNAL OF PHONETICS, 2016, 56 :110-123
[46]   Learning to recognize unfamiliar faces from fine-phonetic detail in visual speech [J].
Jesse, Alexandra .
ATTENTION PERCEPTION & PSYCHOPHYSICS, 2025, 87 (03) :936-951
[47]   Learning multiple distributed prototypes of semantic categories for named entity recognition [J].
Henriksson, Aron .
INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2015, 13 (04) :395-411
[48]   Unsupervised learning of vowel categories from infant-directed speech [J].
Vallabha, Gautam K. ;
McClelland, James L. ;
Pons, Ferran ;
Werker, Janet F. ;
Amano, Shigeaki .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (33) :13273-13278
[49]   Unsupervised Learning of Human Action Categories in Still Images with Deep Representations [J].
Zheng, Yunpeng ;
Li, Xuelong ;
Lu, Xiaoqiang .
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2019, 15 (04)
[50]   Acquisition of colour categories through learning: Differences between hue and lightness [J].
Martinovic, Jasna .
COGNITION, 2024, 242