Speech enhancement from fused features based on deep neural network and gated recurrent unit network

Cited by: 0

Authors:

Youming Wang

Jiali Han

Tianqi Zhang

Didi Qing

Affiliations:

[1] Xi’an University of Posts and Telecommunications, School of Automation

[2] Xi’an Key Laboratory of Advanced Control and Intelligent Process (ACIP)

Source:

EURASIP Journal on Advances in Signal Processing, vol. 2021

Keywords:

Speech enhancement; Deep neural network; Gated recurrent unit; Speech quality

DOI:

Not available

Abstract:

In real environments, speech is easily corrupted by external interference, which causes the loss of important features. Deep learning has become a popular approach to speech enhancement because of its strong potential for solving nonlinear mapping problems over complex features. However, traditional deep learning methods are weak at learning important information from previous time steps and long-term dependencies in time-series data. To overcome this problem, we propose a novel speech enhancement method based on the fused features of a deep neural network (DNN) and a gated recurrent unit (GRU) network. The proposed method uses the GRU to reduce the number of DNN parameters and to capture the context information of the speech, which improves the quality and intelligibility of the enhanced speech. First, a DNN with multiple hidden layers learns the mapping between the logarithmic power spectrum (LPS) features of noisy speech and those of clean speech. Second, the LPS features output by the DNN are fused with the noisy speech features to form the input of the GRU network, compensating for the missing context information. Finally, the GRU network learns the mapping between the fused LPS features and the LPS features of clean speech. The proposed model is compared experimentally with traditional speech enhancement models, including DNN, CNN, LSTM and GRU. Under matched noise conditions, the PESQ, SSNR and STOI of the proposed algorithm improve by 30.72%, 39.84% and 5.53%, respectively, relative to the noisy signal. Under unmatched noise conditions, the PESQ and STOI improve by 23.8% and 37.36%, respectively. The advantage of the proposed method is that it uses the key information in the features to suppress noise in both matched and unmatched noise cases, and it outperforms other common speech enhancement methods.
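The three stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length, hop size, layer sizes, the use of concatenation as the fusion step, and all function names (`log_power_spectrum`, `dnn_forward`, `gru_forward`, `fused_pipeline`) are assumptions made here for illustration, and the weights are random rather than trained, so the output only demonstrates the data flow and tensor shapes.

```python
import numpy as np

def log_power_spectrum(signal, frame_len=256, hop=128, eps=1e-10):
    """Frame the signal, apply a Hann window, and return log|FFT|^2 per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spectrum) ** 2 + eps)  # (n_frames, frame_len // 2 + 1)

def dnn_forward(x, layers):
    """Feed-forward network: ReLU hidden layers followed by a linear output layer."""
    for w, b in layers[:-1]:
        x = np.maximum(x @ w + b, 0.0)
    w, b = layers[-1]
    return x @ w + b

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_forward(x_seq, p):
    """Run a single-layer GRU over a (time, features) sequence; return all hidden states."""
    h = np.zeros(p["Uz"].shape[0])
    states = []
    for x in x_seq:
        z = sigmoid(x @ p["Wz"] + h @ p["Uz"] + p["bz"])             # update gate
        r = sigmoid(x @ p["Wr"] + h @ p["Ur"] + p["br"])             # reset gate
        h_cand = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"] + p["bh"])  # candidate state
        h = (1.0 - z) * h + z * h_cand                               # one common gating convention
        states.append(h)
    return np.stack(states)

def fused_pipeline(noisy, rng, hidden=64):
    """DNN estimate -> fusion with noisy LPS (assumed: concatenation) -> GRU -> LPS output."""
    lps_noisy = log_power_spectrum(noisy)
    n_bins = lps_noisy.shape[1]

    def init(*shape):
        # Small random weights; a real system would learn these from data.
        return 0.01 * rng.standard_normal(shape)

    # Stage 1: DNN maps noisy LPS frames to an estimate of the clean LPS.
    dnn_layers = [(init(n_bins, hidden), np.zeros(hidden)),
                  (init(hidden, n_bins), np.zeros(n_bins))]
    lps_dnn = dnn_forward(lps_noisy, dnn_layers)

    # Stage 2: fuse the DNN output with the noisy LPS along the feature axis.
    fused = np.concatenate([lps_dnn, lps_noisy], axis=1)

    # Stage 3: GRU processes the fused sequence; a linear layer projects back to LPS bins.
    gru = {}
    for gate in "zrh":
        gru["W" + gate] = init(2 * n_bins, hidden)
        gru["U" + gate] = init(hidden, hidden)
        gru["b" + gate] = np.zeros(hidden)
    states = gru_forward(fused, gru)
    lps_enhanced = states @ init(hidden, n_bins) + np.zeros(n_bins)
    return lps_noisy, fused, lps_enhanced

rng = np.random.default_rng(0)
noisy = rng.standard_normal(4000)  # placeholder waveform standing in for noisy speech
lps_noisy, fused, lps_enhanced = fused_pipeline(noisy, rng)
print(lps_noisy.shape, fused.shape, lps_enhanced.shape)
```

In a trained system, the enhanced LPS would be combined with the noisy phase and inverted back to a waveform before computing PESQ, SSNR or STOI; that step is omitted here.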
