Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication

Cited by: 43
Authors
Li, Junfeng [1]
Sakamoto, Shuichi [2]
Hongo, Satoshi [3]
Akagi, Masato [1]
Suzuki, Yoiti [2]
Affiliations
[1] Japan Adv Inst Sci & Technol, Sch Informat Sci, Tokyo, Japan
[2] Tohoku Univ, Elect Commun Res Inst, Sendai, Miyagi 980, Japan
[3] Miyagi Natl Coll Technol, Dept Design & Comp Applicat, Sendai, Miyagi, Japan
Keywords
Binaural masking level difference; Equalization-cancellation model; Two-stage binaural speech enhancement (TS-BASE); Binaural cue preservation; Sound localization; ARRAY HEARING-AIDS; NOISE-REDUCTION; ENVIRONMENTS; OUTPUT
DOI
10.1016/j.specom.2010.04.009
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Speech enhancement has been researched extensively for many years to provide high-quality speech communication in the presence of background noise and concurrent interference signals. Human listening is robust against these acoustic interferences using only two ears, but state-of-the-art two-channel algorithms perform poorly. Motivated by psychoacoustic studies of binaural hearing, specifically equalization-cancellation (EC) theory, we propose a two-stage binaural speech enhancement with Wiener filter (TS-BASE/WF) approach, a two-input two-output system. In TS-BASE/WF, interference signals are first estimated by equalizing and cancelling the target signal in a way inspired by EC theory; a time-variant Wiener filter is then applied to enhance the target signal given the noisy mixture signals. The main advantages of the proposed TS-BASE/WF are (1) effectiveness in dealing with non-stationary multiple-source interference signals, and (2) success in preserving binaural cues after processing. These advantages were confirmed by comprehensive objective and subjective evaluations of speech enhancement and binaural cue preservation in different acoustical spatial configurations. © 2010 Elsevier B.V. All rights reserved.
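The two stages described in the abstract can be illustrated with a rough sketch. This is not the paper's implementation: the abstract does not give the equalization rule or the exact filter, so the sketch assumes a frontal target that is identical at both ears (equalization reduces to plain subtraction), and the function name `ec_wiener_sketch`, the `gain_floor` parameter, and the averaging of the two channel powers are all illustrative assumptions. Applying one real-valued gain to both channels is one simple way to leave interaural level and phase differences unchanged.

```python
import numpy as np


def ec_wiener_sketch(left_spec, right_spec, gain_floor=0.1):
    """Hypothetical two-stage sketch on complex STFT arrays of equal shape.

    Assumption: the target arrives from the front, so it is identical in
    both channels and subtracting the channels cancels it.
    """
    # Stage 1 (EC-inspired): cancel the target to estimate the interference.
    interference = 0.5 * (left_spec - right_spec)
    noise_psd = np.abs(interference) ** 2

    # Power of the noisy mixture, averaged over the two ears (an assumption).
    mix_psd = 0.5 * (np.abs(left_spec) ** 2 + np.abs(right_spec) ** 2)

    # Stage 2: Wiener-style gain, floored to limit over-suppression.
    gain = 1.0 - noise_psd / np.maximum(mix_psd, 1e-12)
    gain = np.clip(gain, gain_floor, 1.0)

    # The same real gain on both channels keeps their ratio unchanged.
    return gain * left_spec, gain * right_spec, gain
```

With identical inputs (a purely frontal target under the stated assumption) the interference estimate is zero and the gain is 1; with fully antiphase inputs the gain falls to `gain_floor`.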
Pages: 677-689
Page count: 13