SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT

Cited by: 18

Authors:

Huy Phan ^{[1]}

Nguyen, Huy Le ^{[2]}

Chen, Oliver Y. ^{[3]}

Koch, Philipp ^{[4]}

Duong, Ngoc Q. K. ^{[5]}

McLoughlin, Ian ^{[6]}

Mertins, Alfred ^{[4]}

Affiliations:

[1] Queen Mary Univ London, London, England

[2] HCMC Univ Technol, Ho Chi Minh City, Vietnam

[3] Univ Oxford, Oxford, England

[4] Univ Lubeck, Lubeck, Germany

[5] InterDigital R&D France, Paris, France

[6] Singapore Inst Technol, Singapore, Singapore

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

Speech enhancement; self-attention; generative adversarial network; GAN; SEGAN;

DOI:

10.1109/ICASSP39728.2021.9414265

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Existing generative adversarial networks (GANs) for speech enhancement rely solely on the convolution operation, which may obscure temporal dependencies across the input sequence. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we empirically study the effect of placing the self-attention layer at (de)convolutional layers with varying layer indices, as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying it at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead.
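The abstract does not give the layer's equations, so the following is only a minimal sketch of a generic non-local self-attention step over a 1-D feature map, not the authors' implementation. The function name `self_attention_1d`, the projection matrices `w_q`, `w_k`, `w_v`, `w_o`, the channel-reduction factor `k`, and the residual weight `beta` are all assumptions introduced for illustration. In a real network the projections would be learned 1x1 convolutions; here they are random matrices.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_1d(feats, w_q, w_k, w_v, w_o, beta=0.0):
    """Generic non-local self-attention over a (L, C) feature map.

    feats: output of a (de)convolutional layer, L time steps by C channels.
    w_q, w_k, w_v: (C, C // k) projections (hypothetical names).
    w_o: (C // k, C) projection back to C channels.
    beta: weight of the attention branch in the residual sum.
    """
    q = feats @ w_q                    # queries, shape (L, C // k)
    key = feats @ w_k                  # keys,    shape (L, C // k)
    v = feats @ w_v                    # values,  shape (L, C // k)
    attn = softmax(q @ key.T, axis=-1) # (L, L): each row attends over all time steps
    out = (attn @ v) @ w_o             # attended features, shape (L, C)
    return beta * out + feats, attn    # residual connection keeps the conv features


# Toy example with assumed sizes: 16 time steps, 8 channels, reduction k = 4.
rng = np.random.default_rng(0)
L, C, k = 16, 8, 4
feats = rng.standard_normal((L, C))
w_q, w_k, w_v = (rng.standard_normal((C, C // k)) * 0.1 for _ in range(3))
w_o = rng.standard_normal((C // k, C)) * 0.1

out, attn = self_attention_1d(feats, w_q, w_k, w_v, w_o, beta=0.5)
print(out.shape, attn.shape)
```

In this sketch the attention map has L x L entries, so its memory cost grows quadratically with the temporal length of the feature map. That is consistent with the abstract's remark that the highest-level (de)convolutional layer, where the sequence is shortest, gives the smallest memory overhead.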


Pages: 7103-7107

Number of pages: 5
