Noise Prior Knowledge Learning for Speech Enhancement via Gated Convolutional Generative Adversarial Network

Authors
Fan, Cunhang [1, 2]
Liu, Bin [1]
Tao, Jianhua [1, 2, 3]
Yi, Jiangyan [1]
Wen, Zhengqi [1]
Bai, Ye [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
Source
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2019
Funding
National Natural Science Foundation of China
DOI
10.1109/apsipaasc47483.2019.9023216
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Speech enhancement generative adversarial network (SEGAN) is an end-to-end deep learning architecture that uses only clean speech as the training target. However, when the signal-to-noise ratio (SNR) is very low, predicting clean speech signals can be very difficult because the speech is dominated by noise. To address this problem, we propose a gated convolutional neural network (CNN) SEGAN (GSEGAN) with noise prior knowledge learning. The proposed model not only estimates the clean speech but also learns noise prior knowledge to assist speech enhancement. In addition, a gated CNN has greater potential for capturing long-term temporal dependencies than a regular CNN. Motivated by this, we use a gated CNN architecture instead of a regular CNN to acquire more detailed information at the waveform level. We evaluate the proposed GSEGAN on the Voice Bank corpus. Experimental results show that GSEGAN outperforms the SEGAN baseline, with relative improvements of 0.7%, 28.2% and 43.9% in perceptual evaluation of speech quality (PESQ), overall signal-to-noise ratio (SNRovl) and segmental signal-to-noise ratio (SNRseg), respectively.
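The gating idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the gate used in GSEGAN, so this assumes the common gated-linear-unit form, in which a linear convolution branch is multiplied element-wise by the sigmoid of a second convolution branch. The function names, the toy waveform, and the kernel weights are all hypothetical and chosen only for illustration.

```python
import math


def conv1d(signal, kernel, bias=0.0):
    """Valid-mode 1-D cross-correlation of a list of samples with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_conv1d(signal, w_lin, w_gate, b_lin=0.0, b_gate=0.0):
    """GLU-style gating (an assumption, not the paper's stated layer):
    output = conv(signal, w_lin) * sigmoid(conv(signal, w_gate))."""
    linear = conv1d(signal, w_lin, b_lin)
    gate = conv1d(signal, w_gate, b_gate)
    return [a * sigmoid(g) for a, g in zip(linear, gate)]


# Toy waveform and hypothetical weights. With all-zero gate weights the
# gate equals sigmoid(0) = 0.5 everywhere, so the output is half the
# linear branch.
waveform = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]
out = gated_conv1d(waveform, w_lin=[1.0, 0.0, -1.0], w_gate=[0.0, 0.0, 0.0])
```

In a trained network the gate weights are learned, so the sigmoid branch decides per time step how much of the linear branch passes through, rather than applying a fixed factor as in this toy case.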
Pages: 662-666
Page count: 5