SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT

Cited by: 23
Authors
Phan, Huy [1]
Nguyen, Huy Le [2]
Chen, Oliver Y. [3]
Koch, Philipp [4]
Duong, Ngoc Q. K. [5]
McLoughlin, Ian [6]
Mertins, Alfred [4]
Affiliations
[1] Queen Mary Univ London, London, England
[2] HCMC Univ Technol, Ho Chi Minh City, Vietnam
[3] Univ Oxford, Oxford, England
[4] Univ Lubeck, Lubeck, Germany
[5] InterDigital R&D France, Paris, France
[6] Singapore Inst Technol, Singapore, Singapore
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
Speech enhancement; self-attention; generative adversarial network; GAN; SEGAN;
DOI
10.1109/ICASSP39728.2021.9414265
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Existing generative adversarial networks (GANs) for speech enhancement rely solely on the convolution operation, which may obscure temporal dependencies across the sequence input. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we empirically study the effect of placing the self-attention layer at (de)convolutional layers with varying layer indices, as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying it at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead.
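The self-attention idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the projection matrices (`w_query`, `w_key`, `w_value`, `w_out`), the reduced channel width `C_k`, and the scalar gate `beta` are assumptions borrowed from common non-local attention formulations, and the feature map is random data standing in for the output of one (de)convolutional layer.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention_1d(features, w_query, w_key, w_value, w_out, beta):
    """Non-local style self-attention over a 1-D feature map.

    features: (T, C) array, T time steps with C channels each.
    Returns the gated residual output (T, C) and the attention map (T, T).
    """
    query = features @ w_query    # (T, C_k)
    key = features @ w_key        # (T, C_k)
    value = features @ w_value    # (T, C_k)

    # Every time step attends to every other time step, so dependencies
    # are not limited by a convolution kernel's receptive field.
    attention = softmax(query @ key.T, axis=-1)    # (T, T)

    attended = (attention @ value) @ w_out         # back to (T, C)

    # Assumed gating: beta = 0 leaves the convolutional features untouched.
    return features + beta * attended, attention


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
T, C, C_k = 16, 8, 2
features = rng.standard_normal((T, C))
w_query = 0.1 * rng.standard_normal((C, C_k))
w_key = 0.1 * rng.standard_normal((C, C_k))
w_value = 0.1 * rng.standard_normal((C, C_k))
w_out = 0.1 * rng.standard_normal((C_k, C))

output, attention = self_attention_1d(
    features, w_query, w_key, w_value, w_out, beta=0.5
)
identity_output, _ = self_attention_1d(
    features, w_query, w_key, w_value, w_out, beta=0.0
)
```

In this sketch the attention map has shape (T, T), so its memory cost grows quadratically with the sequence length. Under that assumption, a layer with a shorter temporal dimension needs a smaller attention map, which fits the abstract's remark about the highest-level layer having the smallest memory overhead.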
Pages: 7103-7107
Page count: 5
Related papers
50 in total
[41]   CP-GAN: CONTEXT PYRAMID GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT [J].
Liu, Gang ;
Gong, Ke ;
Liang, Xiaodan ;
Chen, Zhiguang .
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, :6624-6628
[42]   Speech enhancement method based on the multi-head self-attention mechanism [J].
Chang X. ;
Zhang Y. ;
Yang L. ;
Kou J. ;
Wang X. ;
Xu D. .
Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2020, 47 (01) :104-110
[43]   Research on clothing patterns generation based on multi-scales self-attention improved generative adversarial network [J].
Yu, Zi-yan ;
Luo, Tian-jian .
INTERNATIONAL JOURNAL OF INTELLIGENT COMPUTING AND CYBERNETICS, 2021, 14 (04) :647-663
[44]   Defense method of smart grid GPS spoofing attack based on improved self-attention generative adversarial network [J].
Li Y. ;
Yang S. .
Dianli Zidonghua Shebei/Electric Power Automation Equipment, 2021, 41 (11) :100-106
[45]   Enhancing Automatic Speech Recognition Quality with a Second-Stage Speech Enhancement Generative Adversarial Network [J].
Nossier, Soha A. ;
Wall, Julie ;
Moniri, Mansour ;
Glackin, Cornelius ;
Cannings, Nigel .
2023 IEEE 35TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2023, :546-552
[46]   TFDense-GAN: a generative adversarial network for single-channel speech enhancement [J].
Chen, Haoxiang ;
Zhang, Jinxiu ;
Fu, Yaogang ;
Zhou, Xintong ;
Wang, Ruilong ;
Xu, Yanyan ;
Ke, Dengfeng .
EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2025, 2025 (01)
[47]   Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method [J].
Wu, Jianfeng ;
Hua, Yongzhu ;
Yang, Shengying ;
Qin, Hongshuai ;
Qin, Huibin .
APPLIED SCIENCES-BASEL, 2019, 9 (16)
[48]   Missing Data Imputation for Online Monitoring of Power Equipment Based on Self-attention Generative Adversarial Networks [J].
Zhou Y. ;
Lin M. ;
Chen J. ;
Bai Z. ;
Chen M. .
Gaodianya Jishu/High Voltage Engineering, 2023, 49 (05) :1795-1809
[49]   A generative adversarial network with multiscale and attention mechanisms for underwater image enhancement [J].
Zhao, Liquan ;
Li, Yuda ;
Zhong, Tie .
SCIENTIFIC REPORTS, 2025, 15 (01)
[50]   Self-Attention Recurrent Conditional Generative Adversarial Networks for Corporate Credit Rating Prediction [J].
Lin, Shu-Ying ;
Wang, An-Chi .
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2023, 39 (05) :1209-1230