Joint domain adaptation and speech bandwidth extension using time-domain GANs for speaker verification

Cited by: 1

Authors:

Kataria, Saurabh [1,2]

Villalba, Jesus [1,2]

Moro-Velazquez, Laureano [1]

Dehak, Najim [1,2]

Affiliations:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

INTERSPEECH 2022 | 2022

Keywords:

domain adaptation; speech bandwidth extension; time-domain GAN; non-parallel learning; joint learning;

DOI:

10.21437/Interspeech.2022-10900

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Speech systems developed for a particular choice of acoustic domain and sampling frequency do not translate easily to others. The usual practice is to learn domain adaptation and bandwidth extension models independently. Contrary to this, we propose to learn both tasks together. Particularly, we learn to map narrow-band conversational telephone speech to wideband microphone speech. We developed parallel and non-parallel learning solutions which utilize both paired and unpaired data. We first discuss joint and disjoint training of multiple generative models for our tasks. Then, we propose a two-stage learning solution using a pre-trained domain adaptation system for pre-processing in bandwidth extension training. We evaluated our schemes on a Speaker Verification downstream task. We used the JHU-MIT experimental setup for NIST SRE21, which comprises SRE16, SRE-CTS Superset, and SRE21. Our results prove that learning both tasks is better than learning just one. On SRE16, our best system achieves 22% relative improvement in Equal Error Rate w.r.t. a direct learning baseline and 8% w.r.t. a strong bandwidth expansion system.


Pages: 615-619

Page count: 5
