Improved Sparse NMF based Speech Enhancement Method with Deep Neural Network

Authors
Zou, Xia [1]
Zhang, Xiongwei [1]
Shi, Wenhua [1,2]
Wang, Fupeng [3]
Zhang, Jingtao [3]
Gao, Mingyue [3]
Affiliations
[1] PLA, Army Engn Univ, Nanjing 210007, Jiangsu, Peoples R China
[2] Air Force Aviat Univ, Flight Training Base, Fuxin 123000, Peoples R China
[3] PLA, Unit 91285, Dalian 116000, Peoples R China
Source
PROCEEDINGS OF THE 2ND INTERNATIONAL FORUM ON MANAGEMENT, EDUCATION AND INFORMATION TECHNOLOGY APPLICATION (IFMEITA 2017) | 2017 / Vol. 130
Keywords
Speech enhancement; Deep neural network; Sparse non-negative matrix factorization; DATABASE;
DOI
Not available
Chinese Library Classification number
G40 [Education];
Discipline classification code
040101; 120403;
Abstract
Considering the sparsity of speech signals in the time-frequency domain and the non-linear modeling ability of deep neural networks, this paper presents an improved speech enhancement method based on sparse non-negative matrix factorization. A deep neural network is employed to learn the sparse encoding coefficients of speech and noise from the noisy observation. The estimated clean speech is obtained by applying a Wiener filter to the magnitude spectrogram of the noisy speech. The experimental results show the superiority of the proposed method under stationary and non-stationary noise conditions.
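The reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no formulas, so the function names, the dictionaries `W_s` and `W_n`, the sparsity weight, and the magnitude-domain form of the Wiener-style gain are all assumptions. In particular, the paper estimates the encoding coefficients with a deep neural network, whereas this sketch substitutes a standard sparse NMF multiplicative update as a stand-in.

```python
import numpy as np


def sparse_nmf_activations(Y, W, sparsity=0.1, n_iter=100, eps=1e-12, seed=0):
    """Estimate non-negative activations H such that Y is roughly W @ H.

    Stand-in for the paper's DNN (assumption): multiplicative updates for the
    Euclidean cost with an L1 penalty on H, keeping the dictionary W fixed.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], Y.shape[1])) + eps
    WtY = W.T @ Y   # constant across iterations
    WtW = W.T @ W
    for _ in range(n_iter):
        # The L1 weight in the denominator shrinks activations toward zero.
        H *= WtY / (WtW @ H + sparsity + eps)
    return H


def wiener_enhance(Y, W_s, W_n, H_s, H_n, eps=1e-12):
    """Apply a Wiener-style gain to the noisy magnitude spectrogram Y.

    The gain is the ratio of the speech estimate to the sum of the speech and
    noise estimates (hypothetical magnitude-domain form, not from the paper).
    """
    S = W_s @ H_s           # speech magnitude estimate
    N = W_n @ H_n           # noise magnitude estimate
    gain = S / (S + N + eps)
    return gain * Y
```

With a speech dictionary of `k` atoms, the activations for the noisy spectrogram `Y` would be obtained from the concatenated dictionary `np.hstack([W_s, W_n])` and split as `H[:k]` and `H[k:]` before calling `wiener_enhance`. Because the gain lies between 0 and 1, the enhanced magnitude never exceeds the noisy magnitude in any time-frequency bin.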
Pages: 231-234
Number of pages: 4