SASEGAN-TCN: Speech enhancement algorithm based on self-attention generative adversarial network and temporal convolutional network

Cited by: 1

Authors:

Lv R. [1]

Chen N. [1]

Cheng S. [1]

Fan G. [1]

Rao L. [1]

Song X. [1]

Lv W. [2]

Yang D. [3]

Affiliations:

[1] School of Electronic Information Engineering, Shanghai Dianji University, Shanghai

[2] School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai

[3] Alibaba Group, Shanghai

Source:

Mathematical Biosciences and Engineering | 2024 / Vol. 21 / No. 3

Funding:

National Natural Science Foundation of China

Keywords:

autoencoder; deep learning; generative adversarial network; speech enhancement

DOI:

10.3934/mbe.2024172

Abstract:

Traditional unsupervised speech enhancement models often suffer from problems such as non-aggregation of input feature information, which introduces additional noise during training and thereby reduces the quality of the speech signal. To address these problems, this paper analyzes how the non-aggregation of input speech feature information affects model performance. It then introduces a temporal convolutional network (TCN) and proposes the SASEGAN-TCN speech enhancement model, which captures local feature information and aggregates global feature information to improve model performance and training stability. Simulation experiments show that the model achieves a perceptual evaluation of speech quality (PESQ) score of 2.1636 and a short-time objective intelligibility (STOI) of 92.78% on the Valentini dataset, and 1.8077 and 83.54%, respectively, on the THCHS30 dataset. In addition, the enhanced speech data were fed to an acoustic model to verify recognition accuracy: the speech recognition error rate was reduced by 17.4%, a significant improvement over the baseline model. © 2024 the Author(s).
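The abstract credits the temporal convolutional network with combining local and global feature information. As a rough illustration of that general idea only, the sketch below stacks dilated causal 1-D convolutions in plain Python and shows how the receptive field grows with each layer. It is not the authors' architecture: the function names, the kernel `[1, 1]`, and the dilations `[1, 2, 4]` are assumptions chosen for the example, and the residual connections and nonlinearities usually found in TCN blocks are omitted.

```python
from typing import List, Sequence


def causal_dilated_conv1d(x: Sequence[float], kernel: Sequence[float],
                          dilation: int) -> List[float]:
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation], zeros before t = 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding keeps the layer causal
                acc += w * x[idx]
        out.append(acc)
    return out


def tcn_stack(x: Sequence[float], kernel: Sequence[float],
              dilations: Sequence[int]) -> List[float]:
    """Apply one dilated causal convolution per entry in `dilations` (hypothetical helper)."""
    h = list(x)
    for d in dilations:
        h = causal_dilated_conv1d(h, kernel, d)
    return h


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input samples (current plus past) that influence one output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 7
    # The impulse response spreads over as many samples as the receptive field.
    print(tcn_stack(impulse, [1.0, 1.0], [1, 2, 4]))
    print(receptive_field(2, [1, 2, 4]))
```

Each layer only looks at a couple of samples (local context), but doubling the dilation per layer lets the stack cover a long span of the signal with few layers, which is the usual argument for using a TCN to aggregate longer-range information.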

Pages: 3860-3875

Page count: 15
