The Application of Deep Neural Network in Speech Enhancement Processing

Cited by: 0
Authors
Chen Jian-ming [1]
Liang Zhi-cheng [1]
Affiliations
[1] Army Academy of Armored Forces, Department of Information and Communication, Beijing, People's Republic of China
Source
2018 5th International Conference on Information Science and Control Engineering (ICISCE 2018) | 2018
Keywords
Time-frequency analysis; Speech enhancement algorithm; Ensemble Empirical Mode Decomposition; Deep Neural Network; Empirical Mode Decomposition
DOI
10.1109/ICISCE.2018.00257
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Non-stationary noise is difficult to remove when speech enhancement relies on the Fourier transform. To address this problem, this paper proposes a speech enhancement algorithm that combines Ensemble Empirical Mode Decomposition (EEMD) with a Deep Neural Network (DNN). First, the original signal is preprocessed with EEMD, which decomposes it into a series of intrinsic mode function (IMF) components whose time-frequency information better meets the time-variation requirement. Second, the DNN adjusts the weights of the IMF components, which are then synthesized to produce the enhanced speech. Finally, speech enhancement performance is compared between using EEMD alone, using the Fourier transform as preprocessing, and using EEMD as preprocessing. The results show that the algorithm using EEMD as preprocessing improves the PESQ and STOI scores by 0.745 and 0.169, respectively, effectively improving speech quality and intelligibility.
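The synthesis step described in the abstract (reweighting the IMF components and summing them) can be illustrated with a minimal sketch. This is not the authors' implementation: the two sinusoids below are toy stand-ins for IMFs rather than the output of a real EEMD, the weights are hand-picked where the paper would obtain them from a trained DNN, and the names `synthesize` and `mean_squared_error` are hypothetical.

```python
import math


def synthesize(components, weights):
    """Weighted sum of components: y[n] = sum_k weights[k] * components[k][n]."""
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per component")
    length = len(components[0])
    if any(len(c) != length for c in components):
        raise ValueError("all components must have the same length")
    return [
        sum(w * c[n] for w, c in zip(weights, components))
        for n in range(length)
    ]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy stand-ins for IMFs (assumption: a real pipeline would obtain these from EEMD).
N = 400
clean = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]        # slow, speech-like
noise = [0.5 * math.sin(2 * math.pi * 60 * n / N) for n in range(N)]  # fast, noise-like
components = [noise, clean]  # EMD-style ordering: highest frequency first

# Unit weights give back the noisy mixture of the two components.
noisy = synthesize(components, [1.0, 1.0])

# Hypothetical weights that suppress the noise-dominated component;
# in the paper these would come from the DNN, not be chosen by hand.
enhanced = synthesize(components, [0.1, 1.0])

mse_noisy = mean_squared_error(noisy, clean)
mse_enhanced = mean_squared_error(enhanced, clean)
```

In this toy setting, down-weighting the noise-dominated component brings the synthesized signal closer to the clean one, which is the effect the weighting stage is meant to achieve on real IMFs.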
Pages: 1263-1266
Page count: 4
Related papers
11 in total
[1]  
[Anonymous], 2007, Speech Enhancement: Theory and Practice
[2]  
[Anonymous], EEMD METHOD ITS APPL
[3]  
Daubechies I., 1992, Ten Lectures on Wavelets, DOI 10.1137/1.9781611970104
[4]   The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis [J].
Huang, NE ;
Shen, Z ;
Long, SR ;
Wu, MLC ;
Shih, HH ;
Zheng, QN ;
Yen, NC ;
Tung, CC ;
Liu, HH .
PROCEEDINGS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 1998, 454 (1971) :903-995
[5]   A review on Hilbert-Huang transform: method and its applications to geophysical studies [J].
Huang, Norden E. ;
Wu, Zhaohua .
REVIEWS OF GEOPHYSICS, 2008, 46 (02)
[6]   Deep learning [J].
LeCun, Yann ;
Bengio, Yoshua ;
Hinton, Geoffrey .
NATURE, 2015, 521 (7553) :436-444
[7]  
Ming C., 2013, MATLAB NEURAL NETWOR
[8]  
Rix AW, 2001, INT CONF ACOUST SPEE, P749, DOI 10.1109/ICASSP.2001.941023
[9]   A SHORT-TIME OBJECTIVE INTELLIGIBILITY MEASURE FOR TIME-FREQUENCY WEIGHTED NOISY SPEECH [J].
Taal, Cees H. ;
Hendriks, Richard C. ;
Heusdens, Richard ;
Jensen, Jesper .
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, :4214-4217
[10]   ENSEMBLE EMPIRICAL MODE DECOMPOSITION: A NOISE-ASSISTED DATA ANALYSIS METHOD [J].
Wu, Zhaohua ;
Huang, Norden E. .
ADVANCES IN DATA SCIENCE AND ADAPTIVE ANALYSIS, 2009, 1 (01) :1-41