Extracting phonetic knowledge from learning systems: Perceptrons, support vector machines and linear discriminants

Cited by: 7
Authors
Damper, RI [1]
Gunn, SR [1]
Gore, MO [1]
Affiliations
[1] Univ Southampton, Dept Elect & Comp Sci, Image Speech & Intelligent Syst Res Grp, Southampton SO17 1BJ, Hants, England
Keywords
speech perception; auditory processing; perceptrons; support vector machines; linear discriminant analysis
DOI
10.1023/A:1008359903796
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Speech perception relies on the human ability to decode continuous, analogue sound pressure waves into discrete, symbolic labels ('phonemes') with linguistic meaning. Aspects of this signal-to-symbol transformation have been intensively studied over many decades, using psychophysical procedures. The perception of (synthetic) syllable-initial stop consonants has been especially well studied, since these sounds display a marked categorization effect: they are typically dichotomised into 'voiced' and 'unvoiced' classes according to their voice onset time (VOT). In this case, the category boundary is found to have a systematic relation to the (simulated) place of articulation, but there is no currently-accepted explanation of this phenomenon. Categorization effects have now been demonstrated in a variety of animal species as well as humans, indicating that their origins lie in general auditory and/or learning mechanisms, rather than in some 'phonetic module' specialized to human speech processing. In recent work, we have demonstrated that appropriately-trained computational learning systems ('neural networks') also display the same systematic behaviour as human and animal listeners. Networks are trained on simulated patterns of auditory-nerve firings in response to synthetic 'continua' of stop-consonant/vowel syllables varying in place of articulation and VOT. Unlike real listeners, such a software model is amenable to analysis aimed at extracting the phonetic knowledge acquired in training, so providing a putative explanation of the categorization phenomenon. Here, we study three learning systems: single-layer perceptrons, support vector machines and Fisher linear discriminants. We highlight similarities and differences between these approaches. We find that the modern inductive inference technique for small sample sizes of support vector machines gives the most convincing results. Knowledge extracted from the trained machine indicated that the phonetic percept of voicing is easily and directly recoverable from auditory (but not acoustic) representations.
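As a rough illustration of one of the three learning systems named in the abstract, the sketch below fits a two-class Fisher linear discriminant and reads off its weight vector, which is the kind of quantity one might inspect when "extracting knowledge" from a linear classifier. It is not the authors' implementation: the data are synthetic, the first feature is a hypothetical stand-in for VOT in milliseconds (the paper uses simulated auditory-nerve firing patterns instead), and the function name `fisher_discriminant` is an assumption made for this example.

```python
import numpy as np

def fisher_discriminant(X0, X1):
    """Return (w, b) for a two-class Fisher linear discriminant.

    w is proportional to Sw^{-1} (m1 - m0); b places the decision
    boundary midway between the two projected class means.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix, summed over both classes.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    b = -0.5 * w @ (m0 + m1)
    return w, b

# Synthetic toy data (assumption): column 0 mimics VOT in ms,
# column 1 is an uninformative noise feature.
rng = np.random.default_rng(0)
voiced = np.column_stack([rng.normal(10, 5, 50), rng.normal(0, 1, 50)])
unvoiced = np.column_stack([rng.normal(60, 5, 50), rng.normal(0, 1, 50)])

w, b = fisher_discriminant(voiced, unvoiced)

def predict(x):
    """Return 1 for 'unvoiced', 0 for 'voiced'."""
    return int(w @ x + b > 0)

print("weights:", w, "bias:", b)
print("short VOT ->", predict(np.array([5.0, 0.0])))
print("long VOT  ->", predict(np.array([70.0, 0.0])))
```

In this toy setting the weight on the VOT-like feature is positive and the boundary falls between the two class means; the single-layer perceptron and support vector machine discussed in the abstract would also yield a linear weight vector that can be inspected in the same way.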
Pages: 43-62
Number of pages: 20