DNN-Based Speech Enhancement via Integrating NMF and CASA

Cited by: 0
Authors
Yan, Bofang [1]
Bao, Changchun [1]
Bai, Zhigang [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing, Peoples R China
Source
2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP) | 2018
Funding
National Natural Science Foundation of China
Keywords
Deep neural network; nonnegative matrix factorization; computational auditory scene analysis; Wiener filter; speech enhancement; MONAURAL SPEECH;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a novel speech enhancement method that integrates nonnegative matrix factorization (NMF) and computational auditory scene analysis (CASA) using a deep neural network (DNN). First, the basis matrices of speech and noise are obtained separately via NMF, and the CASA-based ideal ratio mask (IRM) is estimated with the DNN. Then, a linear minimum mean square error (LMMSE) filter is constructed in the fast Fourier transform (FFT) domain and transformed to the Gammatone domain. Finally, an integrated Wiener-like filter is obtained by combining the NMF filter with the CASA mask. Experiments comparing the proposed method with the NMF and CASA methods show that the proposed method performs better.
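The abstract names the building blocks but gives no formulas, so the following is only a rough NumPy sketch of how such a pipeline might be wired together, not the authors' implementation. Several things are assumptions: the KL-divergence multiplicative updates for NMF, the square-root form of the IRM, the use of an oracle IRM in place of the DNN estimate, a single time-frequency grid instead of separate FFT and Gammatone domains, and above all the hypothetical `combine` rule (a geometric mean), since the record does not state how the filter and the mask are actually merged. All function names are illustrative.

```python
import numpy as np

EPS = 1e-12

def nmf_kl(V, rank, n_iter=200, seed=0, W_fixed=None):
    """Approximate V by W @ H using KL-divergence multiplicative updates.

    If W_fixed is given, the bases are kept fixed and only H is fitted.
    """
    rng = np.random.default_rng(seed)
    W = W_fixed if W_fixed is not None else rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + EPS))) / (W.T @ ones + EPS)
        if W_fixed is None:
            W = W * ((V / (W @ H + EPS)) @ H.T) / (ones @ H.T + EPS)
    return W, H

def wiener_like_gain(W_s, W_n, V_noisy):
    """Wiener-like gain from speech and noise bases trained beforehand."""
    W = np.hstack([W_s, W_n])
    _, H = nmf_kl(V_noisy, W.shape[1], W_fixed=W)
    k = W_s.shape[1]
    S_hat = W_s @ H[:k]   # estimated speech spectrogram
    N_hat = W_n @ H[k:]   # estimated noise spectrogram
    return S_hat / (S_hat + N_hat + EPS)

def ideal_ratio_mask(S_pow, N_pow):
    """Assumed IRM definition: sqrt(S / (S + N)); the paper estimates it with a DNN."""
    return np.sqrt(S_pow / (S_pow + N_pow + EPS))

def combine(gain, mask):
    """Hypothetical combination rule (geometric mean); not taken from the paper."""
    return np.sqrt(gain * mask)

# Synthetic nonnegative "spectrograms" standing in for real speech and noise.
rng = np.random.default_rng(1)
S = rng.random((64, 4)) @ rng.random((4, 50))
N = rng.random((64, 3)) @ rng.random((3, 50))
V = S + N

W_s, _ = nmf_kl(S, rank=4)          # speech bases from clean speech
W_n, _ = nmf_kl(N, rank=3)          # noise bases from noise only
gain = wiener_like_gain(W_s, W_n, V)
mask = ideal_ratio_mask(S, N)       # oracle stand-in for the DNN output
combined = combine(gain, mask)
enhanced = combined * V             # apply the integrated filter
```

In this sketch every gain value lies between 0 and 1, so applying `combined` can only attenuate each time-frequency unit of the noisy input.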
Pages: 435-439
Page count: 5