Weight-Space Viterbi Decoding Based Spectral Subtraction for Reverberant Speech Recognition

Cited by: 8

Authors:

Ban, Sung Min [1]

Kim, Hyung Soon [1]

Affiliation:

[1] Pusan Natl Univ, Dept Elect Engn, Pusan 609735, South Korea

Source:

IEEE SIGNAL PROCESSING LETTERS | 2015 / Vol. 22 / No. 09

Keywords:

Dereverberation; spectral subtraction; speech recognition; viterbi decoding; MODEL ADAPTATION; DEREVERBERATION; SUPPRESSION;

DOI:

10.1109/LSP.2015.2408371

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808; 0809

Abstract:

A single-channel blind dereverberation algorithm is proposed in this letter for distant-talking speech recognition. The proposed method is based on the spectral subtraction (SS) method, in which the spectrum of the late reverberant signal is estimated using a delayed and attenuated version of the reverberant signal. Under certain assumptions, the conventional SS method treats the attenuation weight as a constant that is a function of the reverberation time. However, these assumptions do not hold in real situations, and the ideal weight varies from frame to frame. Therefore, in the proposed method, a variable weight sequence is estimated using a Viterbi decoding scheme based on the reverberation model. This weight sequence is then substituted for the fixed weight in the conventional SS method, without explicitly estimating the reverberation time. The proposed method performs better than the conventional SS method in both isolated word recognition and connected digit recognition experiments in reverberant environments.
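The subtraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the spectrogram is a list of non-negative per-frame spectra, that the delay is a whole number of frames, and that a spectral floor is used to avoid negative values. The function name `spectral_subtraction` and the parameters `delay` and `floor` are hypothetical. The Viterbi-based estimation of the weights is not reproduced here; the per-frame weight sequence is simply taken as an input.

```python
from typing import List, Sequence, Union

Spectrogram = List[List[float]]  # frames x frequency bins, non-negative values


def spectral_subtraction(
    reverberant: Spectrogram,
    weights: Union[float, Sequence[float]],
    delay: int,
    floor: float = 0.1,
) -> Spectrogram:
    """Subtract a delayed, attenuated copy of the reverberant spectrum.

    `weights` is either one constant (the conventional fixed-weight case)
    or one value per frame (a variable weight sequence).
    `delay` (in frames) and `floor` are illustrative assumptions.
    """
    n_frames = len(reverberant)
    if isinstance(weights, (int, float)):
        weights = [float(weights)] * n_frames
    if len(weights) != n_frames:
        raise ValueError("need one weight per frame")

    enhanced = []
    for t, frame in enumerate(reverberant):
        if t < delay:
            # No delayed frame is available yet, so pass the frame through.
            enhanced.append(list(frame))
            continue
        delayed = reverberant[t - delay]
        out = []
        for x, d in zip(frame, delayed):
            late = weights[t] * d  # estimated late-reverberation component
            out.append(max(x - late, floor * x))  # floor keeps the result non-negative
        enhanced.append(out)
    return enhanced
```

Passing a single number reproduces the fixed-weight case, while passing a list with one weight per frame corresponds to substituting a variable weight sequence for the constant.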

Pages: 1424-1428

Page count: 5
