Enhancing the magnitude spectrum of speech features for robust speech recognition

被引:1
作者
Hung, Jeih-weih [1 ]
Fan, Hao-teng [1 ]
Tu, Wen-hsiang [1 ]
机构
[1] Natl Chi Nan Univ, Dept Elect Engn, Puli, Taiwan
来源
EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING | 2012年
关键词
Voice activity detection; Robust speech recognition; Speech enhancement; NOISE; COMPENSATION;
D O I
10.1186/1687-6180-2012-189
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
In this article, we present an effective compensation scheme to improve noise robustness for the spectra of speech signals. In this compensation scheme, called magnitude spectrum enhancement (MSE), a voice activity detection (VAD) process is performed on the frame sequence of the utterance. The magnitude spectra of non-speech frames are then reduced while those of speech frames are amplified. In experiments conducted on the Aurora-2 noisy digits database, MSE achieves an error reduction rate of nearly 42% relative to baseline processing. This method outperforms well-known spectral-domain speech enhancement techniques, including spectral subtraction (SS) and Wiener filtering (WF). In addition, the proposed MSE can be integrated with cepstral-domain robustness methods, such as mean and variance normalization (MVN) and histogram normalization (HEQ), to achieve further improvements in recognition accuracy under noise-corrupted environments.
引用
收藏
页数:20
相关论文
共 42 条
  • [1] Acero A, 1990, THESIS
  • [2] [Anonymous], 2000, INTERSPEECH, DOI DOI 10.1016/S0167-6393(03)00016-5
  • [3] [Anonymous], 1996, ITU RECOMMENDATION G
  • [4] EFFECTIVENESS OF LINEAR PREDICTION CHARACTERISTICS OF SPEECH WAVE FOR AUTOMATIC SPEAKER IDENTIFICATION AND VERIFICATION
    ATAL, BS
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1974, 55 (06) : 1304 - 1312
  • [5] Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition
    BabaAli, Bagher
    Sameti, Hossein
    Safayani, Mehran
    [J]. EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2009,
  • [6] Berouti M., 1979, ICASSP 79. 1979 IEEE International Conference on Acoustics, Speech and Signal Processing, P208
  • [7] SUPPRESSION OF ACOUSTIC NOISE IN SPEECH USING SPECTRAL SUBTRACTION
    BOLL, SF
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1979, 27 (02): : 113 - 120
  • [8] MVA processing of speech features
    Chen, Chia-Ping
    Bilmes, Jeff A.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (01): : 257 - 270
  • [9] Chu KK, 2004, 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS, P973
  • [10] Deng L, 2001, INT CONF ACOUST SPEE, P301, DOI 10.1109/ICASSP.2001.940827