A partitioned neural network approach for vowel classification using smoothed time frequency features

Cited by: 19
Authors
Zahorian, SA [1 ]
Nossair, ZB [1 ]
Affiliation
[1] Old Dominion Univ, Dept Elect & Comp Engn, Norfolk, VA 23529 USA
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1999 / Vol. 7 / No. 4
Funding
National Science Foundation (USA);
Keywords
classifier improvement and comparisons; feature extraction;
DOI
10.1109/89.771263
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A novel pattern classification technique and a new feature extraction method are described and tested for vowel classification. The pattern classification technique partitions an N-way classification task into N*(N-1)/2 two-way classification tasks. Each two-way classification task is performed using a neural network classifier that is trained to discriminate the two members of one pair of categories. Multiple two-way classification decisions are then combined to form an N-way decision. Advantages of the new classification approach include independent feature and classifier optimization for each pair of categories, lower sensitivity of classification performance to network parameters, a reduction in the amount of training data required, and potential for superior performance relative to a single large network. The features described in this paper, closely related to the cepstral coefficients and delta cepstra commonly used in speech analysis, are developed using a unified mathematical framework which allows arbitrary nonlinear frequency, amplitude, and time scales to compactly represent the spectral/temporal characteristics of speech. This classification approach, combined with a feature-ranking algorithm which selected the 35 most discriminative spectral/temporal features for each vowel pair, resulted in 71.5% accuracy for classification of 16 vowels extracted from the TIMIT database. These results, significantly higher than other published results for the same task, illustrate the potential of the methods presented in this paper.
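The partitioning described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: each pairwise neural network is replaced by a nearest-centroid stand-in, the two-way decisions are combined by majority vote (the abstract does not state the combination rule, so this is an assumption), and the function names, vowel labels, and toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def num_pairs(n_classes):
    """Number of two-way tasks for an N-way problem: N*(N-1)/2."""
    return n_classes * (n_classes - 1) // 2


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(features, labels):
    """Build one two-way model per pair of categories.

    Stand-in only: each pairwise "classifier" is the pair of class
    centroids. The paper trains a neural network per pair instead.
    """
    classes = sorted(set(labels))
    by_class = {
        c: [f for f, lab in zip(features, labels) if lab == c] for c in classes
    }
    return {
        (a, b): (centroid(by_class[a]), centroid(by_class[b]))
        for a, b in combinations(classes, 2)
    }


def predict(models, x):
    """Combine all two-way decisions by majority vote (assumed rule)."""
    votes = Counter()
    for (a, b), (ca, cb) in models.items():
        winner = a if sq_dist(x, ca) <= sq_dist(x, cb) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


# Toy data: three made-up categories in a 2-D feature space.
X = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.8], [0.0, 5.0], [0.3, 5.2]]
y = ["iy", "iy", "aa", "aa", "uw", "uw"]
models = train_pairwise(X, y)

print(num_pairs(16))               # two-way tasks needed for 16 vowels
print(predict(models, [0.1, 0.0]))
```

For the 16-vowel task in the abstract, this partitioning gives 16*15/2 = 120 two-way classifiers. Because each pair has its own model, a per-pair feature subset (such as the 35 ranked features mentioned above) could be selected inside `train_pairwise` without affecting the other pairs.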
Pages: 414-425
Page count: 12