Adversarial Reweighting for Speaker Verification Fairness

Cited by: 2
Authors
Jin, Minho [1]
Ju, Chelsea J-T [2]
Chen, Zeya [2]
Liu, Yi-Chieh [2]
Droppo, Jasha [2]
Stolcke, Andreas [2]
Affiliations
[1] Amazon Web Services, Palo Alto, CA 94303 USA
[2] Amazon Alexa AI, Sunnyvale, CA USA
Source
INTERSPEECH 2022 | 2022
Keywords
speaker verification; speaker recognition; fairness; recognition
DOI
10.21437/Interspeech.2022-10948
Chinese Library Classification number
O42 [Acoustics]
Discipline classification number
070206; 082403
Abstract
We address performance fairness for speaker verification using the adversarial reweighting (ARW) method. ARW is reformulated for speaker verification with metric learning, and shown to improve results across different subgroups of gender and nationality, without requiring annotation of subgroups in the training data. An adversarial network learns a weight for each training sample in the batch so that the main learner is forced to focus on poorly performing instances. Using a min-max optimization algorithm, this method improves overall speaker verification fairness. We present three different ARW formulations: accumulated pairwise similarity, pseudo-labeling, and pairwise weighting, and measure their performance in terms of equal error rate (EER) on the VoxCeleb corpus. Results show that the pairwise weighting method can achieve 1.08% overall EER, 1.25% for male and 0.67% for female speakers, with relative EER reductions of 7.7%, 10.1% and 3.0%, respectively. For nationality subgroups, the proposed algorithm showed 1.04% EER for US speakers, 0.76% for UK speakers, and 1.22% for all others. The absolute EER gap between gender groups was reduced from 0.70% to 0.58%, while the standard deviation over nationality groups decreased from 0.21 to 0.19.
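The reweighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `arw_weights` and `weighted_loss`, the toy loss values, and the normalization (1 plus the batch size times each sample's share of the sigmoid-squashed adversary scores) are assumptions chosen for illustration. The record does not specify how the three formulations compute weights. In a real system the adversary scores would come from a trained network, the learner would take a gradient step to decrease the weighted loss, and the adversary would take a step to increase it.

```python
import math


def arw_weights(adversary_scores):
    """Turn raw adversary scores into per-sample weights.

    Assumed normalization: weight_i = 1 + n * sigmoid(s_i) / sum_j sigmoid(s_j),
    so every weight is at least 1 and the weights sum to 2 * n.
    """
    squashed = [1.0 / (1.0 + math.exp(-s)) for s in adversary_scores]
    total = sum(squashed)
    n = len(squashed)
    return [1.0 + n * v / total for v in squashed]


def weighted_loss(losses, weights):
    """Batch objective: the learner minimizes it, the adversary maximizes it."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Toy batch (hypothetical values): the third sample is poorly verified.
losses = [0.2, 0.1, 1.5, 0.3]

# An adversary with no preference gives every sample the same weight.
uniform = arw_weights([0.0, 0.0, 0.0, 0.0])

# An adversary that has learned to up-weight the hard sample.
focused = arw_weights([-1.0, -1.0, 2.0, -1.0])

print("uniform weights:", uniform)
print("focused weights:", focused)
print("loss under uniform weights:", weighted_loss(losses, uniform))
print("loss under focused weights:", weighted_loss(losses, focused))
```

With the focused weights the hard sample contributes more to the objective, so the learner's next update is pulled toward fixing it. This is the sense in which the main learner is "forced to focus on poorly performing instances" without any subgroup labels.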
Pages: 4800-4804
Page count: 5