SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT

Cited by: 18
Authors
Phan, Huy [1]
Nguyen, Huy Le [2]
Chen, Oliver Y. [3]
Koch, Philipp [4]
Duong, Ngoc Q. K. [5]
McLoughlin, Ian [6]
Mertins, Alfred [4]
Affiliations
[1] Queen Mary Univ London, London, England
[2] HCMC Univ Technol, Ho Chi Minh City, Vietnam
[3] Univ Oxford, Oxford, England
[4] Univ Lubeck, Lubeck, Germany
[5] InterDigital R&D France, Paris, France
[6] Singapore Inst Technol, Singapore, Singapore
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
Speech enhancement; self-attention; generative adversarial network; GAN; SEGAN;
DOI
10.1109/ICASSP39728.2021.9414265
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Existing generative adversarial networks (GANs) for speech enhancement rely solely on the convolution operation, which may obscure temporal dependencies across the input sequence. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we empirically study the effect of placing the self-attention layer at the (de)convolutional layers with varying layer indices as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying it at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead.
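As a rough illustration of the kind of layer the abstract describes, the following is a minimal NumPy sketch of non-local self-attention applied to a 1-D feature map, as might come out of a (de)convolutional layer. It is not the authors' implementation. The projection matrices (w_q, w_k, w_v, w_o), the reduced channel size C_r, the residual scale gamma, and all shapes are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention_1d(x, w_q, w_k, w_v, w_o, gamma):
    """Non-local self-attention over a 1-D feature map (illustrative sketch).

    x      : (L, C) feature map with L time steps and C channels
    w_q/k/v: (C, C_r) hypothetical query/key/value projections
    w_o    : (C_r, C) hypothetical output projection
    gamma  : scalar weight on the attention branch (assumed learnable)
    """
    q = x @ w_q                        # (L, C_r)
    k = x @ w_k                        # (L, C_r)
    v = x @ w_v                        # (L, C_r)
    attn = softmax(q @ k.T, axis=-1)   # (L, L): each time step attends to all others
    out = gamma * ((attn @ v) @ w_o) + x  # residual connection keeps the conv features
    return out, attn


def attention_map_entries(length):
    """Number of entries in the L x L attention map for a sequence of this length."""
    return length * length


# Toy example with made-up sizes.
rng = np.random.default_rng(0)
L, C, C_r = 16, 8, 2
x = rng.standard_normal((L, C))
w_q, w_k, w_v = (rng.standard_normal((C, C_r)) * 0.1 for _ in range(3))
w_o = rng.standard_normal((C_r, C)) * 0.1
out, attn = self_attention_1d(x, w_q, w_k, w_v, w_o, gamma=0.5)
```

The attention map has L x L entries, so its memory cost grows quadratically with the length of the feature map. Assuming that higher-level layers operate on shorter sequences (as with strided convolutions), this is consistent with the abstract's remark that the highest-level (de)convolutional layer incurs the smallest memory overhead.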
Pages: 7103-7107
Page count: 5