A Dual Microphone Speech Enhancement Method with A Smoothing Parameter Mask

Cited by: 0
Authors
Jiang, Yi [1]
Liu, Runsheng [2]
Affiliations
[1] CPLA, Quartermaster Equipment Res Inst, Beijing, Peoples R China
[2] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
Source
2017 10TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI) | 2017
Keywords
speech enhancement; computational auditory scene analysis (CASA); deep neural networks (DNNs); dual microphone; parameter masks; CLASSIFICATION
DOI
Not available
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Dual-microphone systems provide high-quality speech in noisy conditions and remain an important topic in signal processing. This paper proposes a computational auditory scene analysis (CASA) based speech enhancement method with a smoothing parameter mask. With a flexible dual-microphone setting, we focus on speech enhancement under matched and unmatched training and test conditions. Working with a deep neural network (DNN), the parameter mask is estimated and smoothed using simulated and recorded data. The recorded data are used to smooth the estimated parameter mask trained with simulated data, as a transition to real applications. We use recorded data to train the DNN. Recorded data from various configurations are then used directly to test the proposed speech segregation system. The proposed system yields positive results under trained and untrained conditions and under low signal-to-noise ratio (SNR) test conditions. It also performs well in an office application.
Pages: 5