A STUDY ON JOINT BEAMFORMING AND SPECTRAL ENHANCEMENT FOR ROBUST SPEECH RECOGNITION IN REVERBERANT ENVIRONMENTS

Cited by: 0
Authors
Xiong, Feifei [1]
Meyer, Bernd T. [2, 3]
Goetze, Stefan [1, 3]
Affiliations
[1] Fraunhofer Inst Digital Media Technol IDMT, Project Grp Hearing Speech & Audio Technol HSA, Oldenburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, Oldenburg, Germany
[3] Cluster Excellence Hearing4All, Oldenburg, Germany
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Keywords
Speech dereverberation; minimum variance distortionless response beamformer; minimum mean square error estimator; late reverberation spectral variance; speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This work evaluates multi-microphone beamforming and single-microphone spectral enhancement strategies for mitigating reverberation in robust automatic speech recognition (ASR) systems. The evaluation covers reverberant environments characterized by different reverberation times (T60) and direct-to-reverberation ratios (DRRs). The systems combine minimum variance distortionless response (MVDR) beamformers with minimum mean square error (MMSE) estimators and late reverberation spectral variance (LRSV) estimators, the latter employing a generalized model of the room impulse response (RIR). Various system architectures are analyzed with a focus on optimal speech recognition performance. The system combining an MVDR beamformer with a subsequent MMSE estimator was found to give the best results, with relative reductions of 27.7% compared to the baseline system. This is attributed to a more accurate LRSV estimate, obtained from spatial averaging and diffuse field refinement, for the MMSE estimator.
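As a rough illustration of the MVDR stage named in the abstract, the sketch below computes textbook MVDR weights for a single frequency bin against an ideal diffuse noise field. It is not the authors' implementation: the array geometry, the speed of sound, the diagonal loading value, and the function names `diffuse_coherence` and `mvdr_weights` are all assumptions made for this example.

```python
import numpy as np


def diffuse_coherence(mic_positions, freq_hz, c=343.0):
    """Coherence matrix of an ideal spherically isotropic (diffuse) field."""
    # Pairwise microphone distances in metres.
    diff = mic_positions[:, None, :] - mic_positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so this equals sin(kd) / (kd)
    # with wavenumber k = 2 pi f / c.
    return np.sinc(2.0 * freq_hz * dist / c)


def mvdr_weights(steering, noise_coherence, diag_load=1e-3):
    """Textbook MVDR weights: w = G^{-1} d / (d^H G^{-1} d)."""
    m = noise_coherence.shape[0]
    # Diagonal loading (assumed value) keeps the matrix well conditioned.
    g = noise_coherence + diag_load * np.eye(m)
    g_inv_d = np.linalg.solve(g, steering)
    return g_inv_d / (steering.conj() @ g_inv_d)


# Hypothetical 4-microphone linear array, 5 cm spacing, source at broadside.
mics = np.array([[0.05 * i, 0.0, 0.0] for i in range(4)])
freq = 1000.0
d = np.ones(4, dtype=complex)  # broadside far field: no inter-microphone delay
w = mvdr_weights(d, diffuse_coherence(mics, freq))

# Distortionless constraint: the response toward the source is unity.
response = w.conj() @ d
```

In a full system these weights would be computed per frequency bin and applied to the multichannel short-time spectra, with the beamformer output then passed to a single-channel MMSE estimator as the abstract describes.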
Pages: 5043-5047
Number of pages: 5