Noisy speech recognition using de-noised multiresolution analysis acoustic features

Cited by: 10

Authors:

Chan, CP [1]

Ching, PC [1]

Lee, T [1]

Affiliation:

[1] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2001 / Vol. 110 / Issue 05

Keywords:

Cepstral mean normalization - Feature parameters - High frequency bands - Mel-frequency cepstral coefficients - Noisy speech recognition - Novel applications - Robust speech recognition - Wavelet packet filters;

DOI:

10.1121/1.1398054

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper describes a novel application of multiresolution analysis (MRA) in extracting acoustic features that possess de-noising capability for robust speech recognition. The MRA algorithm is used to construct a mel-scaled wavelet packet filter-bank, from which subband powers are computed as the feature parameters for speech recognition. Wiener filtering is applied to a few selected subbands at some intermediate stages of decomposition. For high-frequency bands, Wiener filters are designed based on a reduced fraction of the estimated noise power, making the consonant features much more prominent and contrastive. The proposed method is evaluated in phone recognition experiments with the TIMIT database. In the presence of stationary white noise at 10-dB SNR, the de-noised MRA features attain a phone recognition rate of 32%. There is a noticeable improvement compared with the accuracy of 29% and 20% attained by the commonly used mel-frequency cepstral coefficients (MFCC) with and without cepstral mean normalization (CMN), respectively. The effectiveness of the MRA features is also verified by the fact that they exhibit smaller distortion from clean speech. © 2001 Acoustical Society of America.
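The abstract's idea of designing the high-frequency Wiener filters from a reduced fraction of the estimated noise power can be illustrated with a minimal sketch. The abstract does not give the filter design, so the scalar per-subband gain `1 - fraction * noise / observed`, the function name `wiener_gain`, the subband powers, and the fraction of 0.5 below are all assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def wiener_gain(subband_power, noise_power, noise_fraction=1.0, floor=0.0):
    """Scalar Wiener-style gain per subband (illustrative, not the paper's design).

    gain = 1 - noise_fraction * noise_power / subband_power, clipped to [floor, 1].
    """
    subband_power = np.asarray(subband_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Guard against division by zero in silent subbands.
    safe_power = np.maximum(subband_power, np.finfo(float).tiny)
    gain = 1.0 - noise_fraction * noise_power / safe_power
    return np.clip(gain, floor, 1.0)

# Hypothetical mean powers of four subbands, ordered from low to high frequency.
observed = np.array([10.0, 4.0, 2.0, 1.5])
# Flat noise estimate, as expected for stationary white noise.
noise = np.ones(4)
# Assumed fractions: full noise estimate in the low bands, half in the high bands.
fractions = np.array([1.0, 1.0, 0.5, 0.5])

gains = wiener_gain(observed, noise, fractions)
# Scaling a subband's coefficients by g scales its power by g**2.
denoised_power = gains ** 2 * observed
```

With a smaller fraction, the gain in the high-frequency bands stays closer to 1, so less energy is removed from the bands where consonant cues are concentrated. In a full system the gains would be applied to wavelet packet coefficients before the subband powers are computed as features.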


Pages: 2567-2574

Page count: 8
