CP-GAN: CONTEXT PYRAMID GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT

被引:0
|
作者
Liu, Gang [1 ]
Gong, Ke [2 ]
Liang, Xiaodan [1 ]
Chen, Zhiguang [1 ]
机构
[1] Sun Yat Sen Univ, Guangzhou, Guangdong, Peoples R China
[2] DarkMatter AI Res, Guangzhou, Guangdong, Peoples R China
基金
国家重点研发计划; 中国国家自然科学基金;
关键词
speech enhancement; generative adversarial network; context pyramid;
D O I
10.1109/icassp40776.2020.9054060
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
The topic of speech enhancement has been largely improved recently, especially with the development of generative adversarial networks (GANs). However prior methods simply follow the GAN architectures from computer vision tasks without specific designs for the speech enhancement according to the audio characteristics (i.e., different granularity context), which may leave noise points in some segments or disturb the contents of the original audio. In this work, we make the first attempt to explore the global and local speech features for coarse-to-fine speech enhancement and introduce a Context Pyramid Generative Adversarial Network (CP-GAN), which contains a densely-connected feature pyramid generator and a dynamic context granularity discriminator to better eliminate audio noise hierarchically. Extensive experiments demonstrate that our CP-GAN effectively achieves state-of-the-art speech enhancement results and boosts the performance of more high-level speech tasks including automatic speech recognition and speaker recognition.
引用
收藏
页码:6624 / 6628
页数:5
相关论文
共 50 条
  • [1] Speech Enhancement Using Generative Adversarial Network (GAN)
    Huq, Mahmudul
    Maskeliunas, Rytis
    HYBRID INTELLIGENT SYSTEMS, HIS 2021, 2022, 420 : 273 - 282
  • [2] Enhancement of Alaryngeal Speech using Generative Adversarial Network (GAN)
    Huq, Mahmudul
    2021 IEEE/ACS 18TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2021,
  • [3] SEGAN: Speech Enhancement Generative Adversarial Network
    Pascual, Santiago
    Bonafonte, Antonio
    Serra, Joan
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3642 - 3646
  • [4] TFDense-GAN: a generative adversarial network for single-channel speech enhancement
    Chen, Haoxiang
    Zhang, Jinxiu
    Fu, Yaogang
    Zhou, Xintong
    Wang, Ruilong
    Xu, Yanyan
    Ke, Dengfeng
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2025, 2025 (01):
  • [5] VSEGAN: VISUAL SPEECH ENHANCEMENT GENERATIVE ADVERSARIAL NETWORK
    Xu, Xinmeng
    Wang, Yang
    Xu, Dongxiang
    Peng, Yiyuan
    Zhang, Cong
    Jia, Jie
    Chen, Binbin
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7307 - 7311
  • [6] GSC Based Speech Enhancement with Generative Adversarial Network
    Zhou, Yao
    Bao, Changchun
    Cheng, Rui
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 901 - 906
  • [7] LANGUAGE AND NOISE TRANSFER IN SPEECH ENHANCEMENT GENERATIVE ADVERSARIAL NETWORK
    Pascual, Santiago
    Park, Maruchan
    Serra, Joan
    Bonafonte, Antonio
    Ahn, Kang-Hun
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5019 - 5023
  • [8] Speech Enhancement via Residual Dense Generative Adversarial Network
    Zhou, Lin
    Zhong, Qiuyue
    Wang, Tianyi
    Lu, Siyuan
    Hu, Hongmei
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2021, 38 (03): : 279 - 289
  • [9] SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT
    Huy Phan
    Nguyen, Huy Le
    Chen, Oliver Y.
    Koch, Philipp
    Duong, Ngoc Q. K.
    McLoughlin, Ian
    Mertins, Alfred
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7103 - 7107
  • [10] Improved Wasserstein conditional generative adversarial network speech enhancement
    Qin, Shan
    Jiang, Ting
    EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2018,