SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT

Cited by: 23

Authors:

Phan, Huy [1]

Nguyen, Huy Le [2]

Chen, Oliver Y. [3]

Koch, Philipp [4]

Duong, Ngoc Q. K. [5]

McLoughlin, Ian [6]

Mertins, Alfred [4]

Affiliations:

[1] Queen Mary Univ London, London, England

[2] HCMC Univ Technol, Ho Chi Minh City, Vietnam

[3] Univ Oxford, Oxford, England

[4] Univ Lubeck, Lubeck, Germany

[5] InterDigital R&D France, Paris, France

[6] Singapore Inst Technol, Singapore, Singapore

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

Speech enhancement; self-attention; generative adversarial network; GAN; SEGAN

DOI:

10.1109/ICASSP39728.2021.9414265

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Existing generative adversarial networks (GANs) for speech enhancement rely solely on the convolution operation, which may obscure temporal dependencies across the sequence input. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we empirically study the effect of placing the self-attention layer at the (de)convolutional layers with varying layer indices as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying it at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead.
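The abstract does not spell out the layer itself, so the following is only a minimal sketch of a generic non-local self-attention step over a 1-D feature map, not the authors' implementation. The function names, the projection matrices `w_q`, `w_k`, `w_v`, `w_o`, the reduced dimension `d`, and the residual weight `beta` are all illustrative assumptions; only the Python standard library is used.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Random Gaussian matrix standing in for learned weights (assumption)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def self_attention_1d(feats, w_q, w_k, w_v, w_o, beta):
    """Generic non-local attention over time.

    feats: T x C feature map (T time steps, C channels).
    Returns (output, attn): a T x C output and the T x T attention map.
    """
    q = matmul(feats, w_q)                    # T x d queries
    k = matmul(feats, w_k)                    # T x d keys
    v = matmul(feats, w_v)                    # T x d values
    k_t = [list(col) for col in zip(*k)]      # d x T
    attn = [softmax(r) for r in matmul(q, k_t)]  # each row sums to 1
    out = matmul(matmul(attn, v), w_o)        # project back to C channels
    # Residual connection: beta = 0 leaves the convolutional features unchanged.
    y = [[f + beta * o for f, o in zip(f_row, o_row)]
         for f_row, o_row in zip(feats, out)]
    return y, attn

rng = random.Random(0)
T, C, d = 6, 8, 2  # toy sizes chosen for illustration
feats = rand_matrix(T, C, rng, scale=1.0)
w_q, w_k, w_v = (rand_matrix(C, d, rng) for _ in range(3))
w_o = rand_matrix(d, C, rng)
y0, attn = self_attention_1d(feats, w_q, w_k, w_v, w_o, beta=0.0)
y1, _ = self_attention_1d(feats, w_q, w_k, w_v, w_o, beta=1.0)
```

In this sketch the attention map holds T x T entries, so a layer with fewer time steps needs less memory for it, which is in line with the abstract's remark about the highest-level (de)convolutional layer having the smallest memory overhead.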


Pages: 7103-7107

Number of pages: 5
