SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT

Cited by: 18

Authors:

Huy Phan [1]

Nguyen, Huy Le [2]

Chen, Oliver Y. [3]

Koch, Philipp [4]

Duong, Ngoc Q. K. [5]

McLoughlin, Ian [6]

Mertins, Alfred [4]

Affiliations:

[1] Queen Mary Univ London, London, England

[2] HCMC Univ Technol, Ho Chi Minh City, Vietnam

[3] Univ Oxford, Oxford, England

[4] Univ Lubeck, Lubeck, Germany

[5] InterDigital R&D France, Paris, France

[6] Singapore Inst Technol, Singapore, Singapore

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

Speech enhancement; self-attention; generative adversarial network; GAN; SEGAN;

DOI:

10.1109/ICASSP39728.2021.9414265

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Existing generative adversarial networks (GANs) for speech enhancement rely solely on the convolution operation, which may obscure temporal dependencies across the input sequence. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we empirically study the effect of placing the self-attention layer at (de)convolutional layers with varying layer indices, as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying it at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead.
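As a rough illustration of the kind of layer the abstract describes, the sketch below applies non-local style self-attention to a one-dimensional feature map (time steps by channels), as might come out of a convolutional layer. It is not the authors' implementation: the function names, the projection matrices `w_query`, `w_key`, `w_value`, and the residual scale `beta` are all assumptions made for this example, and the code uses plain Python lists rather than a deep learning framework.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_1d(features, w_query, w_key, w_value, beta=0.0):
    """Non-local style self-attention over a (time x channels) feature map.

    features: list of T time steps, each a list of C channel values.
    w_query, w_key: C x K projection matrices; w_value: C x C (hypothetical names).
    beta: residual scale (assumed); 0.0 turns the layer into an identity mapping.
    Returns (output, attention), where attention is a T x T matrix.
    """
    queries = matmul(features, w_query)              # T x K
    keys = matmul(features, w_key)                   # T x K
    values = matmul(features, w_value)               # T x C
    keys_t = [list(col) for col in zip(*keys)]       # K x T
    scores = matmul(queries, keys_t)                 # T x T: every step vs. every step
    attention = [softmax(row) for row in scores]     # each row sums to 1
    attended = matmul(attention, values)             # T x C
    # Residual connection: input plus scaled attention output.
    output = [[x + beta * a for x, a in zip(x_row, a_row)]
              for x_row, a_row in zip(features, attended)]
    return output, attention


# Toy usage: 3 time steps, 2 channels, identity projections (illustrative values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, attention = self_attention_1d(features, identity, identity, identity, beta=0.5)
```

The attention matrix has one entry per pair of time steps, so its size grows with the square of the feature map length. This is one way to read the abstract's remark on memory overhead: a layer whose feature map is shorter in time needs a smaller attention matrix.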


Pages: 7103-7107

Number of pages: 5

