Jointing Multi-task Learning and Gradient Reversal Layer for Far-Field Speaker Verification

Cited by: 1
Authors
Xu, Wei [1]
Wang, Xinghao [1]
Wan, Hao [1,2]
Guo, Xin [3]
Zhao, Junhong [1]
Deng, Feiqi [1]
Kang, Wenxiong [1]
Affiliations
[1] South China Univ Technol, Sch Automat Sci & Engn, Guangzhou 510641, Peoples R China
[2] Guangdong Baiyun Airport Informat Technol Co Ltd, Postdoctoral Innovat Base, Guangzhou, Peoples R China
[3] Guangdong Commun Polytech, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Far-field speaker verification; Multi-task learning; Gradient reversal layer; Dynamic loss weights strategy
DOI
10.1007/978-3-030-86608-2_49
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Far-field speaker verification is challenging because of the interference caused by varying distances between the speaker and the recorder. In this paper, a distance discriminator, which determines whether two utterances were recorded at the same distance, is used as an auxiliary task to learn distance discrepancy information. Two identical auxiliary tasks are used: one is added before the speaker embedding layer to learn distance discrepancy information via multi-task learning, and the other is added after that layer to suppress the learned discrepancy via a gradient reversal layer. In addition, to avoid conflicts among the optimization directions of the tasks, the loss weight of each task is updated dynamically during training. Experiments on AISHELL Wake-up show relative reductions in equal error rate (EER) of 7% and 10.3% on far-far and near-far speaker verification, respectively, compared with the single-task model, demonstrating the effectiveness of the proposed method.
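The gradient reversal idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a generic gradient reversal layer written in plain Python, and the function names, the one-parameter "encoder", the squared-error discriminator loss, and the scaling factor `lam` are all assumptions made for illustration. The layer acts as the identity in the forward pass and flips the sign of the gradient in the backward pass, so a discriminator placed after it is trained normally while the layers before it are pushed to make the discriminator's task harder.

```python
# Illustrative sketch only; names and the toy model are hypothetical,
# not taken from the paper.

def grl_forward(x):
    """Forward pass of a gradient reversal layer: the identity map."""
    return list(x)


def grl_backward(grad_output, lam=1.0):
    """Backward pass: negate the incoming gradient and scale it by lam."""
    return [-lam * g for g in grad_output]


def encoder_grad_through_grl(w, x, target, lam=1.0):
    """Toy example: encoder h = w * x, discriminator loss L = (h - target)^2.

    Returns a pair:
      - dL/dh as seen on the discriminator side of the layer,
      - the gradient for the encoder weight w after reversal.
    """
    h = w * x
    (h_out,) = grl_forward([h])                # unchanged by the layer
    dl_dh = 2.0 * (h_out - target)             # ordinary gradient of the loss
    (dl_dh_rev,) = grl_backward([dl_dh], lam)  # sign flipped for the encoder
    return dl_dh, dl_dh_rev * x                # chain rule: dh/dw = x


if __name__ == "__main__":
    disc_grad, enc_grad = encoder_grad_through_grl(w=0.5, x=2.0, target=3.0)
    print(disc_grad, enc_grad)
```

With gradient descent, the reversed sign means the encoder weight moves in the direction that increases the discriminator loss, which is how the learned distance discrepancy would be suppressed in the embedding.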
Pages: 449-457
Number of pages: 9