An application of discriminative feature extraction to filter-bank-based speech recognition

Cited by: 55

Authors:

Biem, A ^{[1]}

Katagiri, S ^{[1]}

McDermott, E ^{[1]}

Juang, BH ^{[1]}

Affiliations:

[1] ATR, Human Informat Proc Res Labs, Kyoto 61902, Japan

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2001 / Vol. 9 / No. 02

Keywords:

feature extraction; filter-bank; generalized probabilistic descent; minimum classification error; pattern recognition; speech recognition;

DOI:

10.1109/89.902277

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

A pattern recognizer is usually a modular system which consists of a feature extractor module and a classifier module. Traditionally, these two modules have been designed separately, which may not result in an optimal recognition accuracy. To alleviate this fundamental problem, the authors have developed a design method, named Discriminative Feature Extraction (DFE), that enables one to design the overall recognizer, i.e., both the feature extractor and the classifier, in a manner consistent with the objective of minimizing recognition errors. This paper investigates the application of this method to designing a speech recognizer that consists of a filter-bank feature extractor and a multi-prototype distance classifier. Carefully investigated experiments demonstrate that DFE achieves the design of a better recognizer and provides an innovative recognition-oriented analysis of the filter-bank, as an alternative to conventional analysis based on psychoacoustic expertise or heuristics.

References

Pages: 96-110

Number of pages: 15

24 entries in total

[1] A THEORY OF ADAPTIVE PATTERN CLASSIFIERS [J].

AMARI, S .

IEEE TRANSACTIONS ON ELECTRONIC COMPUTERS, 1967, EC16 (03) :299-+

[2] Pattern recognition using discriminative feature extraction [J].

Biem, A ;

Katagiri, S ;

Juang, BH .

IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1997, 45 (02) :500-504

[3]

BIEM A, 1997, P IEEE INT C AC SPEE, V2, P1503

[4]

BIEM A, 1993, P IEEE INT C AC SPEE, V2, P275

[5]

Biem A., 1993, P 1993 IEEE WORKSH N, P392

[6]

BIEM A, 1997, THESIS U PARIS 6 PAR

[7] COMPARISON OF PARAMETRIC REPRESENTATIONS FOR MONOSYLLABIC WORD RECOGNITION IN CONTINUOUSLY SPOKEN SENTENCES [J].

DAVIS, SB ;

MERMELSTEIN, P .

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1980, 28 (04) :357-366

[8]

delaTorre A, 1996, SPEECH COMMUN, V20, P273, DOI 10.1016/S0167-6393(96)00061-1

[9]

Fukunaga K., 1972, Introduction to statistical pattern recognition

[10]

Hart P.E., 1973, Pattern recognition and scene analysis
