Noise Prior Knowledge Learning for Speech Enhancement via Gated Convolutional Generative Adversarial Network

Cited by: 0

Authors:

Fan, Cunhang [1,2]

Liu, Bin [1]

Tao, Jianhua [1,2,3]

Yi, Jiangyan [1]

Wen, Zhengqi [1]

Bai, Ye [1,2]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China

[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China

Source:

2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2019

Funding:

National Natural Science Foundation of China

DOI:

10.1109/apsipaasc47483.2019.9023216

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

Speech enhancement generative adversarial network (SEGAN) is an end-to-end deep learning architecture that uses only clean speech as the training target. However, when the signal-to-noise ratio (SNR) is very low, predicting clean speech signals can be very difficult because the speech is dominated by noise. To address this problem, we propose a gated convolutional neural network (CNN) SEGAN (GSEGAN) with noise prior knowledge learning. The proposed model not only estimates the clean speech but also learns noise prior knowledge to assist the speech enhancement. In addition, a gated CNN has greater potential for capturing long-term temporal dependencies than a regular CNN. Motivated by this, we use a gated CNN architecture instead of a regular CNN to acquire more detailed information at the waveform level. We evaluate GSEGAN on the Voice Bank corpus. Experimental results show that GSEGAN outperforms the SEGAN baseline, with relative improvements of 0.7%, 28.2% and 43.9% in perceptual evaluation of speech quality (PESQ), overall signal-to-noise ratio (SNRovl) and segmental signal-to-noise ratio (SNRseg), respectively.
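
The gating idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common gated linear unit form, in which one convolution produces features and a second, sigmoid-activated convolution decides how much of each feature passes through. The abstract does not state the exact gating formulation, layer sizes, or weights used in GSEGAN, so the function names (`conv1d`, `gated_conv1d`) and all kernel values below are hypothetical and chosen only for illustration.

```python
import math


def conv1d(signal, kernel, bias=0.0):
    """Valid-mode 1-D cross-correlation of a list of floats with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def sigmoid(x):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_conv1d(signal, feature_kernel, gate_kernel,
                 feature_bias=0.0, gate_bias=0.0):
    """Gated convolution (assumed GLU form): conv_f(x) * sigmoid(conv_g(x)).

    Both kernels must have the same length so the two branches align.
    """
    features = conv1d(signal, feature_kernel, feature_bias)
    gates = [sigmoid(g) for g in conv1d(signal, gate_kernel, gate_bias)]
    # Element-wise product: each gate value scales the matching feature.
    return [f * g for f, g in zip(features, gates)]


# Toy waveform and hypothetical kernels (not taken from the paper).
waveform = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
feature_kernel = [0.25, 0.5, 0.25]   # smoothing filter for the feature branch
gate_kernel = [1.0, 0.0, -1.0]       # slope detector for the gate branch

gated = gated_conv1d(waveform, feature_kernel, gate_kernel)
plain = conv1d(waveform, feature_kernel)
for p, g in zip(plain, gated):
    print(f"plain={p:+.3f}  gated={g:+.3f}")
```

Because every gate value lies strictly between 0 and 1, each gated output has the same sign as the plain convolution output and a smaller or equal magnitude. In a trained network both kernels would be learned, so the gate can suppress or pass features depending on the input.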

Pages: 662-666

Number of pages: 5

