LEARNABLE NONLINEAR COMPRESSION FOR ROBUST SPEAKER VERIFICATION

Cited by: 2

Authors:

Liu, Xuechen [1,2]

Sahidullah, Md [2]

Kinnunen, Tomi [1]

Affiliations:

[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland

[2] Univ Lorraine, INRIA, CNRS, LORIA, F-54000 Nancy, France

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Keywords:

Speaker Verification; Nonlinear Compression; Multi-Regime Compression; RECOGNITION;

DOI:

10.1109/ICASSP43922.2022.9747185

Chinese Library Classification (CLC):

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In this study, we focus on nonlinear compression methods in spectral features for speaker verification based on deep neural networks. We consider different kinds of channel-dependent (CD) nonlinear compression methods optimized in a data-driven manner. Our methods are based on power nonlinearities and dynamic range compression (DRC). We also propose a multi-regime (MR) design for the nonlinearities, aimed at improving robustness. Results on VoxCeleb1 and VoxMovies data demonstrate improvements brought by the proposed compression methods over both the commonly used logarithm and their static counterparts, especially for those based on the power function. While CD generalization improves performance on VoxCeleb1, MR provides more robustness on VoxMovies, with a maximum relative equal error rate reduction of 21.6%.
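As a rough illustration of the contrast the abstract draws between the logarithm, a static power nonlinearity, and a channel-dependent one, a minimal NumPy sketch follows. It is not the authors' implementation: the function names, the toy input, and the exponent value 1/15 are assumptions chosen for illustration, and no training loop is shown. In a learnable setting the per-channel exponents would be parameters updated from data rather than fixed constants.

```python
import numpy as np


def log_compress(spec, eps=1e-6):
    """Static logarithmic compression, the commonly used baseline."""
    return np.log(spec + eps)


def power_compress(spec, alpha):
    """Power-law compression spec ** alpha.

    spec  : (n_channels, n_frames) array of non-negative filterbank energies
    alpha : scalar (static, shared by all channels) or
            (n_channels,) array (channel-dependent, one exponent per channel)
    """
    alpha = np.asarray(alpha, dtype=float)
    if alpha.ndim == 1:
        # Reshape to (n_channels, 1) so each exponent broadcasts over frames.
        alpha = alpha[:, None]
    return np.power(spec, alpha)


# Toy 4-channel, 3-frame "spectrogram" (hypothetical data).
rng = np.random.default_rng(0)
spec = rng.uniform(0.0, 10.0, size=(4, 3))

# Static variant: a single shared exponent (illustrative value only).
static = power_compress(spec, 1.0 / 15.0)

# Channel-dependent variant: one exponent per channel. Here they are all
# initialised to the same value; a learnable model would update each one.
cd_alpha = np.full(4, 1.0 / 15.0)
channel_dep = power_compress(spec, cd_alpha)
```

With identical exponents the two variants produce the same output; they diverge only once the per-channel values differ, which is the extra freedom a channel-dependent design provides.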


Pages: 7962-7966

Page count: 5
