Confidence scoring for accurate HMM-based speech recognition by using monophone-level normalization based on subspace method

Cited by: 0
Authors
Ghulam, M [1]
Sato, T [1]
Fukuda, T [1]
Nitta, T [1]
Affiliation
[1] Toyohashi Univ Technol, Grad Sch Engn, Toyohashi, Aichi 4418580, Japan
Source
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2003 / Vol. E86D / No. 03
Keywords
confidence measure (CM); subspace method (SM); feature parameters; normalization;
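DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract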
This paper proposes a novel confidence scoring method that is applied to the N-best hypotheses (word candidates) output by an HMM-based classifier. In the first pass of the proposed method, the HMM-based classifier with monophone models outputs the N-best hypotheses and the boundaries of all monophones in the hypotheses. In the second pass, an SM (subspace method)-based verifier tests the hypotheses by comparing confidence scores. To test the hypotheses, the SM-based verifier first calculates the similarity between phone vectors and an eigenvector set of monophones, then converts this similarity score into a likelihood score with normalization of acoustic quality, and finally combines the word-level HMM-based likelihood with the monophone-level SM-based likelihood to formulate the confidence measure. Two kinds of experiments were performed to evaluate this confidence measure on speaker-independent word recognition. The results showed that the proposed confidence scoring method significantly reduced the word error rate from 4.7%, obtained by the standard HMM classifier, to 2.0%, and that in unknown-word rejection it reduced the equal error rate from 9.0% to 6.5%.
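The similarity step of the second pass can be illustrated with the classical subspace-method similarity: the squared projection of a vector onto a class's eigenvectors, divided by the vector's squared norm. The sketch below shows only that general technique and is not the authors' exact formulation. The function name `subspace_similarity`, the toy three-dimensional vectors, and the hand-picked orthonormal basis are all hypothetical. The conversion to a likelihood score and the combination with the HMM likelihood are not specified in the abstract, so they are left out.

```python
def subspace_similarity(x, basis):
    """Return the fraction of the energy of x that lies in the subspace.

    x     : feature vector (list of floats), e.g. a phone vector
    basis : list of orthonormal vectors spanning the class subspace
            (assumed here to be precomputed eigenvectors)
    The result lies in [0, 1]; 1 means x is entirely inside the subspace.
    """
    norm_sq = sum(v * v for v in x)
    if norm_sq == 0.0:
        return 0.0  # a zero vector carries no directional information
    proj_sq = 0.0
    for u in basis:
        dot = sum(a * b for a, b in zip(x, u))
        proj_sq += dot * dot  # squared projection onto one basis vector
    return proj_sq / norm_sq


# Hypothetical toy subspace: the plane spanned by the first two axes.
basis_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

sim_inside = subspace_similarity([3.0, 4.0, 0.0], basis_a)   # lies in the plane
sim_outside = subspace_similarity([0.0, 0.0, 2.0], basis_a)  # orthogonal to it
sim_mixed = subspace_similarity([1.0, 0.0, 1.0], basis_a)    # half in, half out
```

In a verifier of this kind, one such subspace would be held per monophone, and each phone segment delimited by the first pass would be scored against the subspace of its hypothesized monophone.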
Pages: 430 - 437
Page count: 8