A Multi-Discriminator CycleGAN for Unsupervised Non-Parallel Speech Domain Adaptation

Cited by: 15
Authors
Hosseini-Asl, Ehsan [1]
Zhou, Yingbo [1]
Xiong, Caiming [1]
Socher, Richard [1]
Affiliations
[1] Salesforce Res, San Francisco, CA 94105 USA
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
generative models; speech domain adaptation; non-parallel data; unsupervised learning; NEURAL-NETWORKS; VOICE;
DOI
10.21437/Interspeech.2018-1535
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Domain adaptation plays an important role for speech recognition models, in particular for low-resource domains. We propose a novel generative model based on the cycle-consistent generative adversarial network (CycleGAN) for unsupervised non-parallel speech domain adaptation. The proposed model employs multiple independent discriminators on the power spectrogram, each in charge of a different frequency band. As a result, we have 1) better discriminators that focus on fine-grained details of the frequency features, and 2) a generator that is capable of generating more realistic domain-adapted spectrograms. We demonstrate the effectiveness of our method on speech recognition with gender adaptation, where the model only has access to supervised data from one gender during training but is evaluated on the other at test time. Our model achieves average relative improvements over the baseline of 7.41% in phoneme error rate on the TIMIT dataset and 11.10% in word error rate on the WSJ dataset. Qualitatively, our model also generates more natural-sounding speech when conditioned on data from the other domain.
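The abstract's idea of giving each discriminator its own frequency band can be illustrated with a minimal sketch. This is not the authors' implementation: the equal-width, non-overlapping band split, the function names `split_into_bands` and `toy_band_score`, and the use of mean power as a stand-in for a learned discriminator are all assumptions made purely for illustration.

```python
from typing import List

# Assumed layout: spec[frequency_bin][time_frame] holds power values.
Spectrogram = List[List[float]]


def split_into_bands(spec: Spectrogram, num_bands: int) -> List[Spectrogram]:
    """Split a spectrogram along the frequency axis into contiguous bands.

    Hypothetical helper: bands are non-overlapping and as equal in size
    as possible; the paper may split the frequency axis differently.
    """
    num_bins = len(spec)
    if not 1 <= num_bands <= num_bins:
        raise ValueError("num_bands must be between 1 and the number of bins")
    base, extra = divmod(num_bins, num_bands)
    bands = []
    start = 0
    for i in range(num_bands):
        # The first `extra` bands receive one additional frequency bin.
        size = base + (1 if i < extra else 0)
        bands.append(spec[start:start + size])
        start += size
    return bands


def toy_band_score(band: Spectrogram) -> float:
    """Stand-in for a per-band discriminator: the mean power in the band.

    A real discriminator would be a trained neural network.
    """
    values = [v for row in band for v in row]
    return sum(values) / len(values)


# Toy spectrogram: 8 frequency bins x 4 time frames, power equal to bin index.
spec = [[float(f)] * 4 for f in range(8)]
bands = split_into_bands(spec, num_bands=3)
band_sizes = [len(b) for b in bands]
scores = [toy_band_score(b) for b in bands]
```

In a multi-discriminator setup, each band would be passed to its own discriminator and the resulting adversarial losses combined; how they are weighted and combined is not specified in this record.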
Pages: 3758-3762
Number of pages: 5
Related papers
50 in total
  • [1] Adaptive Weighted Multi-Discriminator CycleGAN for Underwater Image Enhancement
    Park, Jaihyun
    Han, David K.
    Ko, Hanseok
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2019, 7 (07)
  • [2] UDAMA: Unsupervised Domain Adaptation through Multi-discriminator Adversarial Training with Noisy Labels Improves Cardio-fitness Prediction
    Wu, Yu
    Spathis, Dimitris
    Jia, Hong
    Perez-Pozuelo, Ignacio
    Gonzales, Tomas I.
    Brage, Soren
    Wareham, Nicholas
    Mascolo, Cecilia
    MACHINE LEARNING FOR HEALTHCARE CONFERENCE, VOL 219, 2023, 219
  • [3] UMLE: Unsupervised Multi-discriminator Network for Low Light Enhancement
    Qu, Yangyang
    Chen, Kai
    Liu, Chao
    Ou, Yongsheng
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 4318 - 4324
  • [4] Speech Intelligibility Enhancement By Non-Parallel Speech Style Conversion Using CWT and iMetricGAN Based CycleGAN
    Xiao, Jing
    Liu, Jiaqi
    Li, Dengshi
    Zhao, Lanxin
    Wang, Qianrui
    MULTIMEDIA MODELING (MMM 2022), PT I, 2022, 13141 : 544 - 556
  • [5] CycleGAN-based Non-parallel Speech Enhancement with an Adaptive Attention-in-attention Mechanism
    Yu, Guochen
    Wang, Yutian
    Zheng, Chengshi
    Wang, Hui
    Zhang, Qin
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 523 - 529
  • [6] Unsupervised Multi-Discriminator Generative Adversarial Network for Lung Nodule Malignancy Classification
    Kuang, Yan
    Lan, Tian
    Peng, Xueqiao
    Selasi, Gati Elvis
    Liu, Qiao
    Zhang, Junyi
    IEEE ACCESS, 2020, 8 : 77725 - 77734
  • [7] Non-parallel text style transfer with domain adaptation and an attention model
    Mingxuan Hu
    Min He
    Applied Intelligence, 2021, 51 : 4609 - 4622
  • [8] Non-parallel text style transfer with domain adaptation and an attention model
    Hu, Mingxuan
    He, Min
    APPLIED INTELLIGENCE, 2021, 51 (07) : 4609 - 4622
  • [9] Disentangled Discriminator for Unsupervised Domain Adaptation on Object Detection
    Zhu, Yangguang
    Guo, Ping
    Wei, Haoran
    Zhao, Xin
    Wu, Xiangbin
    2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023, : 5685 - 5691
  • [10] CycleGAN based Unsupervised Domain Adaptation for Machine Fault Diagnosis
    Pattnaik, Naibedya
    Vemula, Uday Sai
    Kumar, Kriti
    Kumar, A. Anil
    Majumdar, Angshul
    Chandra, M. Girish
    Pal, Arpan
    PROCEEDINGS OF THE TWENTIETH ACM CONFERENCE ON EMBEDDED NETWORKED SENSOR SYSTEMS, SENSYS 2022, 2022, : 973 - 979