LEARNABLE NONLINEAR COMPRESSION FOR ROBUST SPEAKER VERIFICATION

Cited by: 2
Authors
Liu, Xuechen [1, 2]
Sahidullah, Md [2]
Kinnunen, Tomi [1]
Affiliations
[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland
[2] Univ Lorraine, INRIA, CNRS, LORIA, F-54000 Nancy, France
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022
Keywords
Speaker Verification; Nonlinear Compression; Multi-Regime Compression; RECOGNITION;
DOI
10.1109/ICASSP43922.2022.9747185
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this study, we focus on nonlinear compression methods for spectral features in speaker verification based on deep neural networks. We consider different kinds of channel-dependent (CD) nonlinear compression methods optimized in a data-driven manner. Our methods are based on power nonlinearities and dynamic range compression (DRC). We also propose a multi-regime (MR) design for the nonlinearities, aimed at improving robustness. Results on VoxCeleb1 and VoxMovies data demonstrate improvements brought by the proposed compression methods over both the commonly used logarithm and their static counterparts, especially for those based on the power function. While CD generalization improves performance on VoxCeleb1, MR provides more robustness on VoxMovies, with a maximum relative equal error rate reduction of 21.6%.
Pages: 7962-7966
Number of pages: 5
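
The abstract contrasts static compression (one nonlinearity shared by all frequency channels) with channel-dependent (CD) compression based on power nonlinearities. The sketch below illustrates only that distinction, under stated assumptions: the function name `power_compress`, the input layout (channels by frames), and the exponent values are hypothetical and not taken from the paper, and the exponents are fixed here, whereas the paper optimizes them in a data-driven manner. The DRC and multi-regime variants are not shown.

```python
import numpy as np


def power_compress(spec, alpha):
    """Power-law compression of a non-negative spectrogram.

    spec  : array of shape (channels, frames), non-negative energies
            (layout is an assumption for this sketch).
    alpha : scalar exponent (static case) or array of shape (channels,)
            holding one exponent per frequency channel (CD case).
    """
    spec = np.asarray(spec, dtype=np.float64)
    alpha = np.asarray(alpha, dtype=np.float64)
    if alpha.ndim == 1:
        # One exponent per channel, broadcast across all frames.
        alpha = alpha[:, None]
    return np.power(spec, alpha)


# Toy 2-channel, 2-frame input (illustrative values only).
spec = np.array([[1.0, 16.0],
                 [1.0, 81.0]])

static_out = power_compress(spec, 0.5)       # same exponent for every channel
cd_out = power_compress(spec, [0.5, 0.25])   # a different exponent per channel
```

In a trainable system the per-channel exponents would be model parameters updated together with the speaker embedding network; here they are constants so that the static and channel-dependent forms can be compared directly.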