The Application of Deep Neural Network in Speech Enhancement Processing

Cited by: 0

Authors:

Chen Jian-ming [1]

Liang Zhi-cheng [1]

Affiliations:

[1] Army Acad Armored Forces, Dept Informat & Commun, Beijing, Peoples R China

Source:

2018 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE 2018) | 2018

Keywords:

Time-frequency analysis; Speech enhancement algorithm; Ensemble Empirical Mode Decomposition; Deep Neural Network; Empirical Mode Decomposition

DOI:

10.1109/ICISCE.2018.00257

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

To address the difficulty of removing non-stationary noise in speech enhancement based on the Fourier transform, this paper proposes a speech enhancement algorithm that combines Ensemble Empirical Mode Decomposition (EEMD) with a Deep Neural Network (DNN). First, the original signal is preprocessed with EEMD, which decomposes it into a series of intrinsic mode function (IMF) components whose time-frequency information better meets the time-variation requirement. Second, the DNN adjusts the weights of the IMF components, which are then synthesized into the enhanced speech. Finally, speech enhancement performance is compared among three approaches: EEMD alone, the Fourier transform as preprocessing, and EEMD as preprocessing. The results show that the algorithm using EEMD as preprocessing improves the PESQ and STOI scores by 0.745 and 0.169, respectively, effectively improving speech quality and intelligibility.
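The weighting-and-synthesis step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two sinusoids below are toy stand-ins for IMF components (no actual EEMD is run), the weights are hand-picked rather than predicted by a DNN, and the names `synthesize` and `mse` are hypothetical helpers introduced only for this example.

```python
import math


def synthesize(imfs, weights):
    """Return the sample-by-sample weighted sum of the IMF components."""
    if len(imfs) != len(weights):
        raise ValueError("one weight per IMF component is required")
    n = len(imfs[0])
    if any(len(c) != n for c in imfs):
        raise ValueError("all IMF components must have the same length")
    return [sum(w * c[t] for w, c in zip(weights, imfs)) for t in range(n)]


def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy stand-ins for IMFs (assumption: real IMFs would come from an EEMD routine).
fs = 8000   # sampling rate in Hz, chosen for illustration
n = 160     # 20 ms of samples
speech_like = [math.sin(2 * math.pi * 200 * t / fs) for t in range(n)]
noise_like = [0.5 * math.sin(2 * math.pi * 3000 * t / fs) for t in range(n)]

# EMD-type decompositions order components from high to low frequency.
imfs = [noise_like, speech_like]

# Unit weights simply add the components back into the noisy mixture.
noisy = synthesize(imfs, [1.0, 1.0])

# Hypothetical weights: in the paper a DNN adjusts these; here they are hand-picked
# to attenuate the noise-dominated component.
weights = [0.1, 1.0]
enhanced = synthesize(imfs, weights)

print("MSE of noisy signal vs. clean:   ", mse(noisy, speech_like))
print("MSE of enhanced signal vs. clean:", mse(enhanced, speech_like))
```

Down-weighting the noise-dominated component lowers the error against the clean reference, which is the intuition behind letting a trained network choose the per-IMF weights before synthesis.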


Pages: 1263-1266

Page count: 4
