Residual Unet with Attention Mechanism for Time-Frequency Domain Speech Enhancement

Cited by: 0

Authors:

Chen, Hanyu [1]

Peng, Xiwei [1]

Jiang, Qiqi [1]

Guo, Yujie [1]

Affiliation:

[1] Beijing Institute of Technology, School of Automation, Beijing 100081, People's Republic of China

Source:

2022 41st Chinese Control Conference (CCC) | 2022

Keywords:

Speech enhancement; Unet; residual unit; attention gating

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Eliminating the negative effects of background environmental noise is an interesting and challenging task in audio processing. In recent years, denoising methods based on neural networks (NN) have achieved good performance. In particular, structures built on a convolutional encoder and decoder have been shown to enhance speech effectively. On this basis, this paper proposes a residual Unet structure combined with an attention mechanism. The proposed structure reduces the impact of vanishing gradients on network training and narrows the semantic gap between encoder output and decoder output that the Unet shortcut connections introduce. Experimental results show that, compared with a DNN baseline and a plain Unet network, the quality of the enhanced speech is significantly improved.
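The two ingredients named in the abstract, a residual unit and attention gating on the Unet shortcut connection, can be illustrated with a small NumPy sketch. This record does not give the paper's layer configuration, so everything below is an assumption: the blocks follow the generic textbook forms (an identity skip around a learned mapping, and an additive attention gate that rescales encoder features before they are merged with decoder features). All function names, shapes, and weights are hypothetical. The 1x1 channel-mixing layers stand in for real convolutions, and the gating signal is assumed to already have the same time-frequency resolution as the encoder features.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residual_unit(x, w1, w2):
    """Generic residual unit: y = relu(x + F(x)).

    x: features of shape (channels, freq, time).
    w1, w2: (channels, channels) weights of two pointwise (1x1) layers.
    The identity path x gives gradients a direct route through the block.
    """
    h = relu(np.einsum("oc,cft->oft", w1, x))
    h = np.einsum("oc,cft->oft", w2, h)
    return relu(x + h)

def attention_gate(skip, gate, w_x, w_g, psi):
    """Generic additive attention gate on a shortcut connection.

    skip: encoder features, shape (C, F, T).
    gate: decoder features, shape (C, F, T) (same resolution assumed).
    w_x, w_g: (C_int, C) projections; psi: (C_int,) scoring vector.
    Returns the rescaled encoder features and the (F, T) attention map.
    """
    q = relu(np.einsum("ic,cft->ift", w_x, skip)
             + np.einsum("ic,cft->ift", w_g, gate))
    alpha = sigmoid(np.einsum("i,ift->ft", psi, q))  # one weight per T-F bin
    return skip * alpha[None, :, :], alpha

# Toy usage with random features and weights (illustrative values only).
rng = np.random.default_rng(0)
C, F, T, C_INT = 4, 8, 6, 2
enc = rng.standard_normal((C, F, T))   # encoder output at one scale
dec = rng.standard_normal((C, F, T))   # decoder features at the same scale

res_out = residual_unit(enc,
                        rng.standard_normal((C, C)) * 0.1,
                        rng.standard_normal((C, C)) * 0.1)
gated, alpha = attention_gate(res_out, dec,
                              rng.standard_normal((C_INT, C)),
                              rng.standard_normal((C_INT, C)),
                              rng.standard_normal(C_INT))

# Decoder input: gated shortcut features concatenated along channels.
merged = np.concatenate([gated, dec], axis=0)
```

In this sketch the attention map holds one weight between 0 and 1 per time-frequency bin, so encoder features that the decoder context scores as irrelevant are attenuated before concatenation instead of being passed through unchanged.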

Citation

Pages: 7007-7011

Page count: 5

