DNN Based Speech Enhancement for Unseen Noises Using Monte Carlo Dropout

Cited by: 0
Authors
Nazreen, P. M. [1 ]
Ramakrishnan, A. G. [1 ]
Affiliation
[1] Indian Inst Sci, Dept Elect Engn, MILE Lab, Bangalore 560012, Karnataka, India
Source
2018 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ICSPCS) | 2018
Keywords
speech enhancement; deep neural networks; DNN; dropout; unseen noise; Monte Carlo; model uncertainty; SQUARE ERROR ESTIMATION; RECOGNITION; SIGNALS
DOI
Not available
Chinese Library Classification
TB8 [Photographic technology];
Discipline classification code
0804;
Abstract
In this work, we propose using dropout as a Bayesian estimator to improve the generalizability of a deep neural network (DNN) for speech enhancement. Using Monte Carlo (MC) dropout, we explore whether the DNN can achieve better enhancement in unseen noisy conditions. Two DNNs are trained on speech corrupted with five different noises at three SNRs, one with conventional dropout and the other with MC dropout, and both are tested on speech with unseen noises. Speech samples are taken from the TIMIT database and noises from NOISEX-92. In a second experiment, we train five DNN models separately, each on speech corrupted with one of five different noises at three SNRs. The model precision estimated using MC dropout is used as a proxy for squared error to dynamically select, for each frame of test data, the DNN model expected to perform best. The first set of experiments aims to improve the performance of an existing DNN with conventional dropout on unseen noises by replacing the conventional dropout with MC dropout. The second set of experiments aims to find an optimal way of choosing the best DNN model for de-noising in unseen noisy conditions when multiple noise-specific DNN models are available.
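The two ideas in the abstract (averaging stochastic forward passes with dropout left active at test time, and using the resulting uncertainty to pick one of several models per frame) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy one-layer "enhancer", the function names (`make_toy_model`, `mc_dropout_predict`, `select_per_frame`), the dropout rates, and the number of passes are all hypothetical. The sketch also assumes that the sample variance across passes can stand in for the inverse of the model precision, which simplifies the estimator named in the abstract.

```python
import random
import statistics


def make_toy_model(weights, p_drop=0.2):
    """Hypothetical one-layer 'enhancer': a linear map with dropout on its inputs."""
    keep = 1.0 - p_drop

    def forward(x, rng):
        # Inverted dropout, kept active at test time: zero each input with
        # probability p_drop and rescale the survivors by 1 / keep.
        masked = [xi / keep if rng.random() < keep else 0.0 for xi in x]
        return [sum(w * m for w, m in zip(row, masked)) for row in weights]

    return forward


def mc_dropout_predict(forward, x, n_passes=50, seed=0):
    """Run n_passes stochastic forward passes; return per-dimension mean and variance."""
    rng = random.Random(seed)
    samples = [forward(x, rng) for _ in range(n_passes)]
    n_dims = len(samples[0])
    mean = [statistics.fmean([s[d] for s in samples]) for d in range(n_dims)]
    var = [statistics.pvariance([s[d] for s in samples]) for d in range(n_dims)]
    return mean, var


def select_per_frame(models, frames, n_passes=50):
    """For each frame, keep the output of the model with the lowest total variance.

    Assumption: lower MC-dropout variance (higher precision) is treated as a
    proxy for lower squared error, as the abstract describes.
    """
    chosen, outputs = [], []
    for frame in frames:
        results = [mc_dropout_predict(m, frame, n_passes, seed=i)
                   for i, m in enumerate(models)]
        uncertainty = [sum(var) for _, var in results]
        best = min(range(len(models)), key=uncertainty.__getitem__)
        chosen.append(best)
        outputs.append(results[best][0])
    return chosen, outputs


if __name__ == "__main__":
    # Two hypothetical noise-specific models and two toy feature frames.
    model_a = make_toy_model([[1.0, 2.0]], p_drop=0.5)
    model_b = make_toy_model([[1.0, 2.0]], p_drop=0.1)
    frames = [[1.0, 1.0], [0.5, 2.0]]
    print(select_per_frame([model_a, model_b], frames))
```

In a real system the frames would be spectral feature vectors and each model a trained DNN; the selection step only needs each model to expose a stochastic forward pass.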
Pages: 6