SASEGAN-TCN: Speech enhancement algorithm based on self-attention generative adversarial network and temporal convolutional network

Cited by: 1

Authors:

Lv R. [1]

Chen N. [1]

Cheng S. [1]

Fan G. [1]

Rao L. [1]

Song X. [1]

Lv W. [2]

Yang D. [3]

Affiliations:

[1] School of Electronic Information Engineering, Shanghai Dianji University, Shanghai

[2] School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai

[3] Alibaba Group, Shanghai

Source:

Mathematical Biosciences and Engineering | 2024 / Vol. 21 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

autoencoder; deep learning; generative adversarial network; speech enhancement;

DOI:

10.3934/mbe.2024172

Abstract:

Traditional unsupervised speech enhancement models often suffer from problems such as non-aggregation of input feature information, which introduces additional noise during training and thereby reduces the quality of the speech signal. To address these problems, this paper analyzed how problems such as the non-aggregation of input speech feature information affect model performance. It then introduced a temporal convolutional network and proposed the SASEGAN-TCN speech enhancement model, which captures local feature information and aggregates global feature information to improve model performance and training stability. Simulation results showed that the model achieved a perceptual evaluation of speech quality (PESQ) score of 2.1636 and a short-time objective intelligibility (STOI) of 92.78% on the Valentini dataset, and 1.8077 and 83.54%, respectively, on the THCHS30 dataset. In addition, the enhanced speech data were fed to an acoustic model to verify recognition accuracy. The speech recognition error rate was reduced by 17.4%, a significant improvement over the baseline model. © 2024 the Author(s).

Pages: 3860-3875

Page count: 15
