Confidence scoring for accurate HMM-based speech recognition by using monophone-level normalization based on subspace method

Cited by: 0

Authors:

Ghulam, M [1]
Sato, T [1]
Fukuda, T [1]
Nitta, T [1]

Affiliations:

[1] Toyohashi Univ Technol, Grad Sch Engn, Toyohashi, Aichi 4418580, Japan

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2003 / Vol. E86D / No. 03

Keywords:

confidence measure (CM); subspace method (SM); feature parameters; normalization

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper proposes a novel confidence scoring method that is applied to the N-best hypotheses (word candidates) output by an HMM-based classifier. In the first pass, the HMM-based classifier with monophone models outputs the N-best hypotheses and the boundaries of all monophones in those hypotheses. In the second pass, a verifier based on the subspace method (SM) tests the hypotheses by comparing confidence scores. To test the hypotheses, the SM-based verifier first calculates the similarity between phone vectors and a set of monophone eigenvectors; this similarity score is then converted into a likelihood score with normalization of acoustic quality; finally, the word-level HMM-based likelihood and the monophone-level SM-based likelihood are combined to form the confidence measure. Two kinds of experiments were performed to evaluate this confidence measure on speaker-independent word recognition. The results showed that the proposed confidence scoring method significantly reduced the word error rate from 4.7%, obtained by the standard HMM classifier, to 2.0%, and that in unknown-word rejection it reduced the equal error rate from 9.0% to 6.5%.
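The second-pass scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The similarity function follows the standard subspace-method definition (squared projection of a vector onto a class's leading eigenvectors, normalized by the vector's squared norm). The abstract does not specify how similarity is converted to a likelihood or how the two likelihoods are combined, so `combined_confidence`, its `weight` parameter, and the log-based conversion are hypothetical assumptions chosen only for illustration.

```python
import numpy as np

def subspace_basis(samples, dim):
    """Leading `dim` eigenvectors of the autocorrelation matrix of the
    training vectors for one monophone (columns of the returned array)."""
    samples = np.asarray(samples, dtype=float)
    corr = samples.T @ samples / len(samples)
    _, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :dim]       # reorder to descending, keep top `dim`

def subspace_similarity(x, basis):
    """Subspace-method similarity in [0, 1]: the fraction of the squared
    norm of `x` that lies inside the monophone subspace."""
    x = np.asarray(x, dtype=float)
    proj = basis.T @ x
    return float(proj @ proj / (x @ x))

def combined_confidence(hmm_word_loglik, phone_similarities, weight=0.5, eps=1e-12):
    """ASSUMPTION: a weighted sum of the word-level HMM log-likelihood and the
    mean log of the monophone similarities. The abstract does not give the
    actual conversion or combination rule; this is a placeholder."""
    sims = np.maximum(np.asarray(phone_similarities, dtype=float), eps)
    sm_score = float(np.mean(np.log(sims)))
    return (1.0 - weight) * hmm_word_loglik + weight * sm_score
```

In this sketch, each hypothesis in the N-best list would receive one `combined_confidence` value computed from its word-level HMM score and the similarities of its monophone segments, and the hypotheses would then be compared on that value.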

Citation

Pages: 430-437

Page count: 8

References (11 in total)

[1] ASADI R, 1990, P ICASSP 90, P125
[2] FUKUDA T, 2001, P ICASSP 01, P129
[3] Makino S., 1992, ACOUSTICAL SCI TECHN, V48, P899
[4] MATSUURA H, 1993, IEICE T INF SYST D2, V76, P2486
[5] Nitta, T. Feature extraction for speech recognition based on orthogonal acoustic-feature planes and LDA [J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999: 421-424
[6] NITTA T, 2001, P SPRING M ASJ, P131
[7] OJA E, 1983, SUBSPACE METHOD PATT
[8] Schaaf T, 1997, INT CONF ACOUST SPEE, P875, DOI 10.1109/ICASSP.1997.596075
[9] Sukkar, RA; Lee, CH. Vocabulary independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1996, 4(6): 420-429
[10] Ukita, T; Saito, E; Nitta, T; Watanabe, S. A speaker-independent connected digit recognition system concatenating statistically discriminated words [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1992, 40(10): 2414-2424
