Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments

Cited by: 36
Authors
Kim, HK [1]
Rose, RC
Affiliations
[1] Kwangju Inst Sci & Technol, Dept Informat & Commun, Kwangju 500712, South Korea
[2] AT&T Labs Res, Florham Pk, NJ 07932 USA
Keywords
acoustic feature compensation; cepstrum compensation; noise-robust front-end; speech enhancement; speech recognition
DOI
10.1109/TSA.2003.815515
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a set of acoustic feature pre-processing techniques that are applied to improve automatic speech recognition (ASR) performance on noisy speech recognition tasks. The principal contribution of this paper is an approach for cepstrum-domain feature compensation in ASR that is motivated by techniques for decomposing speech and noise that were originally developed for noisy speech enhancement. This approach is applied in combination with other feature compensation algorithms to compensate ASR features obtained from a mel-filterbank cepstrum coefficient front-end. Performance comparisons are made with respect to the application of the minimum mean squared error log spectral amplitude (MMSE-LSA) estimator based speech enhancement algorithm prior to feature analysis. An experimental study is presented where the feature compensation approaches described in the paper are found to greatly reduce ASR word error rate compared to uncompensated features under environmental and channel mismatched conditions.
Pages: 435-446
Page count: 12
Related papers
44 in total
  • [1] Cepstrum-Domain Model Combination Based on Decomposition of Speech and Noise Using MMSE-LSA for ASR in Noisy Environments
    Kim, Hong Kook
    Rose, Richard C.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (04): 704 - 713
  • [2] Cepstrum-domain model combination based on decomposition of speech and noise for noisy speech recognition
    Kim, HK
    Rose, RC
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002: 209 - 212
  • [3] Speech enhancement method based on feature compensation gain for effective speech recognition in noisy environments
    Bae, Ara
    Kim, Wooil
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2019, 38 (01): 51 - 55
  • [4] Feature domain compensation of nonstationary noise for robust speech recognition
    Kim, NS
    SPEECH COMMUNICATION, 2002, 37 (3-4) : 231 - 248
  • [5] Speech Feature Compensation Based on Pseudo Stereo Codebooks for Robust Speech Recognition in Additive Noise Environments
    Hsieh, Tsung-hsueh
    Hung, Jeih-weih
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007: 2400 - 2403
  • [6] Nonlinear noise compensation in feature domain for speech recognition with numerical methods
    Jiang, H
    Wang, Q
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004: 985 - 988
  • [7] Speech Stream Detection for Noisy Environments Based on Empirical Mode Decomposition
    Tang Qiang
    Zhang Dexiang
    Yan Qing
    ADVANCED DESIGN AND MANUFACTURING TECHNOLOGY III, PTS 1-4, 2013, 397-400 : 2239 - +
  • [8] A cepstrum domain HMM-based speech enhancement method applied to nonstationary noise
    Nilsson, M
    Dahl, M
    Claesson, I
    SIGNAL PROCESSING FOR TELECOMMUNICATIONS AND MULTIMEDIA, 2005, 27 : 1 - 13
  • [9] A VTS-based Feature Compensation Method using Noisy Speech HMMs
    Chung, Yongjoo
    APPLIED MATHEMATICS & INFORMATION SCIENCES, 2014, 8 (06): 2849 - 2856
  • [10] An RNN-based noise estimation and likelihood compensation for noisy speech recognition
    Hong, WT
    Chen, SH
    NEURAL NETWORKS FOR SIGNAL PROCESSING VI, 1996: 293 - 301