Enhancing the magnitude spectrum of speech features for robust speech recognition

Cited by: 1

Authors:

Hung, Jeih-weih [1]

Fan, Hao-teng [1]

Tu, Wen-hsiang [1]

Affiliation:

[1] National Chi Nan University, Department of Electrical Engineering, Puli, Taiwan

Source:

EURASIP Journal on Advances in Signal Processing | 2012

Keywords:

Voice activity detection; Robust speech recognition; Speech enhancement; Noise; Compensation

DOI:

10.1186/1687-6180-2012-189

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

In this article, we present an effective compensation scheme to improve noise robustness for the spectra of speech signals. In this compensation scheme, called magnitude spectrum enhancement (MSE), a voice activity detection (VAD) process is performed on the frame sequence of the utterance. The magnitude spectra of non-speech frames are then reduced while those of speech frames are amplified. In experiments conducted on the Aurora-2 noisy digits database, MSE achieves an error reduction rate of nearly 42% relative to baseline processing. This method outperforms well-known spectral-domain speech enhancement techniques, including spectral subtraction (SS) and Wiener filtering (WF). In addition, the proposed MSE can be integrated with cepstral-domain robustness methods, such as mean and variance normalization (MVN) and histogram normalization (HEQ), to achieve further improvements in recognition accuracy under noise-corrupted environments.
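
The abstract describes MSE only at a high level: run VAD over the frames, then attenuate the magnitude spectra of non-speech frames and amplify those of speech frames. The following is a minimal sketch of that idea, not the authors' implementation. The energy-threshold VAD, the function names, and the gain values (`speech_gain`, `nonspeech_gain`) are all assumptions made for illustration; the record does not specify how the VAD works or how strongly the spectra are scaled.

```python
from typing import List, Sequence


def frame_energies(mag_frames: Sequence[Sequence[float]]) -> List[float]:
    """Return the energy (sum of squared magnitudes) of each frame."""
    return [sum(m * m for m in frame) for frame in mag_frames]


def simple_energy_vad(
    mag_frames: Sequence[Sequence[float]], threshold_ratio: float = 1.0
) -> List[bool]:
    """Hypothetical VAD: a frame counts as speech when its energy exceeds
    threshold_ratio times the mean frame energy of the utterance.
    This stands in for whatever VAD the paper actually uses."""
    energies = frame_energies(mag_frames)
    if not energies:
        return []
    threshold = threshold_ratio * sum(energies) / len(energies)
    return [e > threshold for e in energies]


def enhance_magnitude_spectra(
    mag_frames: Sequence[Sequence[float]],
    speech_gain: float = 1.5,      # assumed amplification for speech frames
    nonspeech_gain: float = 0.5,   # assumed attenuation for non-speech frames
) -> List[List[float]]:
    """Scale each frame's magnitude spectrum according to its VAD label."""
    flags = simple_energy_vad(mag_frames)
    enhanced = []
    for frame, is_speech in zip(mag_frames, flags):
        gain = speech_gain if is_speech else nonspeech_gain
        enhanced.append([gain * m for m in frame])
    return enhanced


if __name__ == "__main__":
    # Three toy frames with two frequency bins each; the middle one is "speech".
    frames = [[0.1, 0.1], [1.0, 2.0], [0.2, 0.1]]
    print(simple_energy_vad(frames))
    print(enhance_magnitude_spectra(frames))
```

In a full front end, the scaled magnitude spectra would be recombined with the original phase or passed on to feature extraction, and cepstral-domain methods such as MVN could be applied afterwards, as the abstract suggests.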

Pages: 20