Binaural speech enhancement system combining dereverberation and spatial masking-based noise removal for robust speech recognition

Cited by: 0

Authors:

Tien Dung Tran [1]

Dang Khoa Nguyen [1]

Quoc Cuong Nguyen [2]

Huu Binh Nguyen [2]

Affiliations:

[1] Hanoi Univ Sci & Technol, MICA Inst, Hanoi, Vietnam

[2] Hanoi Univ Sci & Technol, Sch Elect Engn, Hanoi, Vietnam

Source:

2012 FOURTH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS (ICCE) | 2012

Keywords:

Binaural speech enhancement; dereverberation; spatial mask; automatic speech recognition; SEPARATION;

DOI:

Not available

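Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code: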

0808 ; 0809 ;

Abstract:

In this paper, we present a binaural speech enhancement system as a preprocessing step for robust speech recognition. The system applies an existing dereverberation technique followed by a spatial masking-based noise removal algorithm, which uses a threshold angle to retain only signals arriving from the desired directions. While state-of-the-art approaches fix the threshold angle heuristically over all time frames, we propose an adaptive computation in which the threshold angle is first learned from several noise-only frames and then updated frame by frame. Speech recognition results in a real environment show the effectiveness of the proposed speech enhancement approach.
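The adaptive threshold idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how the threshold angle is learned or updated, so everything below is an assumption made for illustration only: the target is taken to sit at 0 degrees, each time-frequency bin is taken to already have an estimated arrival angle, the initial threshold is set halfway between the target and the mean noise direction, and the frame-by-frame update is a simple exponential smoothing controlled by a hypothetical parameter `alpha`. The function names are hypothetical as well, and the dereverberation stage is not modeled.

```python
def half_mean_abs(angles):
    """Half the mean absolute angle: midway between 0 degrees and the noise."""
    return 0.5 * sum(abs(a) for a in angles) / len(angles)


def learn_threshold(noise_frames):
    """Initial threshold (degrees) from frames assumed to contain noise only.

    Assumed rule, not taken from the paper.
    """
    all_angles = [a for frame in noise_frames for a in frame]
    return half_mean_abs(all_angles)


def adaptive_spatial_mask(frames, threshold, alpha=0.9):
    """Build one binary mask per frame and update the threshold after each frame.

    frames: list of frames, each a list of per-bin arrival angles in degrees.
    Returns the masks and the threshold that was applied to each frame.
    """
    masks, thresholds = [], []
    for frame in frames:
        # Keep bins whose angle lies within the current threshold of the target.
        mask = [1 if abs(a) <= threshold else 0 for a in frame]
        masks.append(mask)
        thresholds.append(threshold)
        # Assumed update: smooth toward the estimate from the rejected bins.
        rejected = [a for a, m in zip(frame, mask) if m == 0]
        if rejected:
            threshold = alpha * threshold + (1 - alpha) * half_mean_abs(rejected)
    return masks, thresholds
```

In this sketch, `learn_threshold` would be called once on the leading noise-only frames and its result passed to `adaptive_spatial_mask` for the remaining frames. A fixed-threshold baseline of the kind the abstract contrasts against corresponds to `alpha=1.0`, which leaves the threshold unchanged.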

Citation

Pages: 345-350

Page count: 6
