Joint domain adaptation and speech bandwidth extension using time-domain GANs for speaker verification

Cited by: 1
Authors
Kataria, Saurabh [1, 2]
Villalba, Jesus [1, 2]
Moro-Velazquez, Laureano [1]
Dehak, Najim [1, 2]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Source
INTERSPEECH 2022 | 2022
Keywords
domain adaptation; speech bandwidth extension; time-domain GAN; non-parallel learning; joint learning
DOI
10.21437/Interspeech.2022-10900
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech systems developed for a particular choice of acoustic domain and sampling frequency do not translate easily to others. The usual practice is to learn domain adaptation and bandwidth extension models independently. Contrary to this, we propose to learn both tasks together. Particularly, we learn to map narrow-band conversational telephone speech to wideband microphone speech. We developed parallel and non-parallel learning solutions which utilize both paired and unpaired data. We first discuss joint and disjoint training of multiple generative models for our tasks. Then, we propose a two-stage learning solution using a pre-trained domain adaptation system for pre-processing in bandwidth extension training. We evaluated our schemes on a Speaker Verification downstream task. We used the JHU-MIT experimental setup for NIST SRE21, which comprises SRE16, SRE-CTS Superset, and SRE21. Our results prove that learning both tasks is better than learning just one. On SRE16, our best system achieves 22% relative improvement in Equal Error Rate w.r.t. a direct learning baseline and 8% w.r.t. a strong bandwidth expansion system.
Pages: 615-619
Page count: 5
Related papers
44 in total
  • [21] Aggregating discriminative embedding by triple-domain feature joint learning with bidirectional sampling for speaker verification
    Zi, Yunfei
    Xiong, Shengwu
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 83
  • [22] Robust Tool Wear Prediction using Multi-Sensor Fusion and Time-Domain Features for the Milling Process using Instance-based Domain Adaptation
    Warke, Vivek
    Kumar, Satish
    Bongale, Arunkumar
    Kotecha, Ketan
    KNOWLEDGE-BASED SYSTEMS, 2024, 288
  • [23] TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS
    Zhang, Xulong
    Wang, Jianzong
    Cheng, Ning
    Xiao, Jing
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [24] Domain Adaptation Using Factorized Hidden Layer for Robust Automatic Speech Recognition
    Sim, Khe Chai
    Narayanan, Arun
    Misra, Ananya
    Tripathi, Anshuman
    Pundak, Golan
    Sainath, Tara N.
    Haghani, Parisa
    Li, Bo
    Bacchiani, Michiel
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 892 - 896
  • [25] Robust Speech Recognition Using Teacher-Student Learning Domain Adaptation
    Ma, Han
    Zhang, Qiaoling
    Tang, Roubing
    Zhang, Lu
    Jia, Yubo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (12) : 2112 - 2118
  • [26] Cross-lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    INTERSPEECH 2020, 2020, : 2947 - 2951
  • [27] When Whisper Meets TTS: Domain Adaptation Using only Synthetic Speech Data
    Vasquez-Correal, Juan Camilo
    Arzelus, Haritz
    Martin-Donas, Juan M.
    Arellano, Joaquin
    Gonzalez-Docasal, Ander
    Alvarez, Aitor
    TEXT, SPEECH, AND DIALOGUE, TSD 2023, 2023, 14102 : 226 - 238
  • [28] Domain Adaptation for Part-of-Speech Tagging of Indonesian Text Using Affix Information
    Maulana, Aditya
    Romadhony, Ade
    5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMPUTATIONAL INTELLIGENCE 2020, 2021, 179 : 640 - 647
  • [29] Improved Cross-Corpus Speech Emotion Recognition Using Deep Local Domain Adaptation
    Zhao Huijuan
    Ye Ning
    Wang Ruchuan
    CHINESE JOURNAL OF ELECTRONICS, 2023, 32 (03) : 640 - 646
  • [30] Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer
    Huang, Lu
    Li, Boyu
    Zhang, Jun
    Lu, Lu
    Ma, Zejun
    INTERSPEECH 2023, 2023, : 386 - 390