DNN-BASED AR-WIENER FILTERING FOR SPEECH ENHANCEMENT

Citations: 0
Authors
Yang, Yan [1]
Bao, Changchun [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Funding
National Natural Science Foundation of China;
Keywords
speech enhancement; deep neural network; auto-regressive model; speech-presence probability; Wiener filter;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
This paper presents a novel approach for estimating auto-regressive (AR) model parameters with a deep neural network (DNN) for AR-Wiener filtering speech enhancement. Unlike a conventional DNN that predicts a single kind of target, the DNN used in this paper is trained at the offline stage to predict the AR model parameters of speech and noise simultaneously. The network is trained by minimizing the Euclidean distance between the DNN output and the AR model parameters of clean speech and noise. At the online stage, acoustic features are first extracted from the noisy speech and used as the input of the DNN. The DNN then estimates the AR model parameters of speech and noise simultaneously. Finally, the Wiener filter is constructed from the estimated AR model parameters of speech and noise. However, because the AR model parameters capture only the spectral shape and not the spectral details, some residual noise remains between the harmonics. To address this problem, we introduce the speech-presence probability (SPP): in the test stage, the SPP is estimated and used to update the Wiener filter. The experimental results show that our approach outperforms some existing approaches.
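As a rough illustration of the final construction step described in the abstract, the sketch below builds a Wiener gain from two sets of AR parameters using the textbook AR power spectrum, sigma^2 / |A(e^{jw})|^2, and the gain P_s / (P_s + P_n). This is not taken from the paper: the function names, the FFT size, and the example coefficient values are hypothetical, the AR parameters would in practice come from the DNN, and the SPP-based update is omitted because the abstract does not specify its form.

```python
import numpy as np

def ar_power_spectrum(ar_coeffs, gain, n_fft=512):
    """AR power spectrum gain / |A(e^{jw})|^2, with A(z) = 1 + sum_k a_k z^-k.

    ar_coeffs: a_1..a_p (hypothetical input, e.g. one frame of DNN output)
    gain:      excitation variance sigma^2 (assumed to be given)
    """
    a = np.concatenate(([1.0], np.asarray(ar_coeffs, dtype=float)))
    A = np.fft.rfft(a, n_fft)          # zero-padded FFT evaluates A on the unit circle
    return gain / np.abs(A) ** 2

def ar_wiener_gain(speech_ar, speech_gain, noise_ar, noise_gain, n_fft=512):
    """Frequency-domain Wiener gain P_s / (P_s + P_n) from two AR models."""
    ps = ar_power_spectrum(speech_ar, speech_gain, n_fft)
    pn = ar_power_spectrum(noise_ar, noise_gain, n_fft)
    return ps / (ps + pn)

# Hypothetical first-order models: low-pass "speech", high-pass "noise".
H = ar_wiener_gain([-0.9], 1.0, [0.5], 0.1)
```

The gain `H` has one value per non-negative frequency bin and would be multiplied with the noisy short-time spectrum of the same frame; in this toy example it stays close to 1 at low frequencies, where the assumed speech model dominates, and drops at high frequencies.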
Pages: 2901-2905
Page count: 5