CycleGAN-based speech enhancement for the unpaired training data

Authors
Yuan, Jing [1]
Bao, Changchun [1]
Affiliation
[1] Beijing Univ Technol, Beijing, Peoples R China
Source
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC) | 2019
Funding
National Natural Science Foundation of China
DOI
10.1109/apsipaasc47483.2019.9023072
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Speech enhancement aims to improve speech quality in noisy conditions. Many speech enhancement methods have achieved remarkable success when trained on paired data. For many tasks, however, paired training data are not available. In this paper, we present a speech enhancement method for unpaired data based on a cycle-consistent generative adversarial network (CycleGAN), which is trained to minimize the reconstruction loss. The proposed model employs two generators and two discriminators to preserve speech components and reduce noise, so that the network can map features better for unseen noise. The generators produce the enhanced speech, and the two discriminators distinguish real inputs from the outputs of the generators. Experimental results show that the proposed method improves performance compared with a traditional deep neural network (DNN) and recent GAN-based speech enhancement methods.
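The reconstruction (cycle-consistency) idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy generators, the L1 distance, the feature dimension, and all function names below are assumptions chosen only to show how a noisy frame mapped to the clean domain and back is compared with itself, and likewise for a clean frame, without requiring paired data. The two discriminators and the adversarial loss are omitted.

```python
import random


def make_toy_generator(dim, seed):
    """Hypothetical stand-in for a generator network: per-feature gain and bias."""
    rnd = random.Random(seed)
    gains = [1.0 + 0.1 * rnd.uniform(-1, 1) for _ in range(dim)]
    biases = [0.1 * rnd.uniform(-1, 1) for _ in range(dim)]
    return lambda frame: [g * x + b for g, x, b in zip(gains, frame, biases)]


def l1_distance(a, b):
    """Mean absolute difference between two feature frames (assumed loss form)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(noisy_frames, clean_frames, g_n2c, g_c2n):
    """Sum of the two reconstruction terms:
    noisy -> clean -> noisy, and clean -> noisy -> clean."""
    noisy_term = sum(
        l1_distance(g_c2n(g_n2c(f)), f) for f in noisy_frames
    ) / len(noisy_frames)
    clean_term = sum(
        l1_distance(g_n2c(g_c2n(f)), f) for f in clean_frames
    ) / len(clean_frames)
    return noisy_term + clean_term


DIM = 8  # hypothetical feature dimension
rnd = random.Random(0)
# Unpaired data: the two sets are drawn independently and differ in size.
noisy_frames = [[rnd.gauss(0, 1) for _ in range(DIM)] for _ in range(4)]
clean_frames = [[rnd.gauss(0, 1) for _ in range(DIM)] for _ in range(6)]

g_noisy_to_clean = make_toy_generator(DIM, seed=1)
g_clean_to_noisy = make_toy_generator(DIM, seed=2)

loss = cycle_consistency_loss(
    noisy_frames, clean_frames, g_noisy_to_clean, g_clean_to_noisy
)
print(f"cycle-consistency loss: {loss:.4f}")
```

Because each term compares a frame only with its own reconstruction, no clean counterpart of a given noisy frame is needed. In a full CycleGAN this term would be combined with adversarial losses from the two discriminators.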
Pages: 878-883
Number of pages: 6