Joint domain adaptation and speech bandwidth extension using time-domain GANs for speaker verification

Cited by: 1
Authors
Kataria, Saurabh [1, 2]
Villalba, Jesus [1, 2]
Moro-Velazquez, Laureano [1]
Dehak, Najim [1, 2]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Source
INTERSPEECH 2022 | 2022
Keywords
domain adaptation; speech bandwidth extension; time-domain GAN; non-parallel learning; joint learning
DOI
10.21437/Interspeech.2022-10900
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech systems developed for a particular choice of acoustic domain and sampling frequency do not translate easily to others. The usual practice is to learn domain adaptation and bandwidth extension models independently. In contrast, we propose to learn both tasks together. In particular, we learn to map narrowband conversational telephone speech to wideband microphone speech. We develop parallel and non-parallel learning solutions that use both paired and unpaired data. We first discuss joint and disjoint training of multiple generative models for our tasks. Then, we propose a two-stage learning solution that uses a pre-trained domain adaptation system for pre-processing in bandwidth extension training. We evaluate our schemes on a speaker verification downstream task using the JHU-MIT experimental setup for NIST SRE21, which comprises SRE16, SRE-CTS Superset, and SRE21. Our results show that learning both tasks is better than learning just one. On SRE16, our best system achieves a 22% relative improvement in Equal Error Rate (EER) with respect to a direct learning baseline and 8% with respect to a strong bandwidth extension system.
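The abstract reports its gains as relative improvements in Equal Error Rate. As a reading aid, here is a minimal sketch of how such a figure is typically computed from verification scores. The function names and all numeric values are hypothetical illustrations and are not taken from the paper; the threshold sweep is a simple approximation, not the scoring tool the authors used.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # False rejection rate: fraction of target trials scored below the threshold.
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    # False acceptance rate: fraction of non-target trials scored at or above it.
    far = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    # The EER is where the two error rates cross; take the closest point.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def relative_improvement(baseline_eer, system_eer):
    """Relative reduction in EER with respect to a baseline (0.22 means 22%)."""
    return (baseline_eer - system_eer) / baseline_eer


# Hypothetical EER values in percent, chosen only to illustrate the arithmetic.
print(relative_improvement(10.0, 7.8))
```

Under these made-up values, a drop from 10.0% to 7.8% EER corresponds to a relative improvement of about 22%, which is the kind of ratio the abstract quotes.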
Pages: 615-619
Number of pages: 5