SASEGAN-TCN: Speech enhancement algorithm based on self-attention generative adversarial network and temporal convolutional network

Cited by: 1
Authors
Lv R. [1]
Chen N. [1]
Cheng S. [1]
Fan G. [1]
Rao L. [1]
Song X. [1]
Lv W. [2]
Yang D. [3]
Affiliations
[1] School of Electronic Information Engineering, Shanghai Dianji University, Shanghai
[2] School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai
[3] Alibaba Group, Shanghai
Funding
National Natural Science Foundation of China
Keywords
autoencoder; deep learning; generative adversarial network; speech enhancement;
DOI
10.3934/mbe.2024172
Abstract
Traditional unsupervised speech enhancement models often fail to aggregate input feature information, which introduces additional noise during training and reduces the quality of the speech signal. To address this, the paper analyzes how problems such as the non-aggregation of input speech feature information affect model performance. It then introduces a temporal convolutional network (TCN) and proposes the SASEGAN-TCN speech enhancement model, which captures local feature information and aggregates global feature information to improve model performance and training stability. In simulation experiments, the model achieved a perceptual evaluation of speech quality (PESQ) score of 2.1636 and a short-time objective intelligibility (STOI) of 92.78% on the Valentini dataset, and 1.8077 and 83.54%, respectively, on the THCHS30 dataset. The enhanced speech was also fed to an acoustic model to verify recognition accuracy: the speech recognition error rate was reduced by 17.4%, a significant improvement over the baseline model. © 2024 the Author(s).
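As a rough illustration of the temporal convolutional idea named in the abstract, the sketch below shows a causal dilated 1-D convolution and how stacking layers with growing dilation widens the receptive field, so that each layer sees local context while the stack covers a long span. It is not the authors' SASEGAN-TCN implementation: the function names, the averaging kernel, and the dilation schedule (1, 2, 4) are assumptions chosen only for illustration.

```python
def causal_dilated_conv1d(signal, kernel, dilation=1):
    """Causal dilated convolution:
    y[t] = sum_k kernel[k] * signal[t - k * dilation],
    treating samples before t = 0 as zero (left zero-padding)."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # never look at future samples
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample
    after stacking one layer per entry in `dilations`."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    # Hypothetical toy input; a real model would operate on learned
    # feature maps of a speech waveform, not on raw integers.
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    h = x
    for d in (1, 2, 4):  # assumed dilation schedule
        h = causal_dilated_conv1d(h, [0.5, 0.5], dilation=d)
    print("stack output:", h)
    print("receptive field:", receptive_field(2, [1, 2, 4]))
```

With a kernel of size 2, three layers at dilations 1, 2 and 4 cover 8 input samples, which is the usual argument for why a short stack of dilated layers can aggregate long-range context cheaply.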
Pages: 3860-3875
Number of pages: 15