Unsupervised Speaker Adaptation based on the Cosine Similarity for Text-Independent Speaker Verification

Cited by: 0

Authors:

Shum, Stephen [1]

Dehak, Najim [1]

Dehak, Reda [2]

Glass, James R. [1]

Affiliations:

[1] MIT, Comp Sci & Artificial Intelligence Lab, 32 Vassar St, Cambridge, MA 02139 USA

[2] LRDE, Paris, France

Source:

ODYSSEY 2010: THE SPEAKER AND LANGUAGE RECOGNITION WORKSHOP | 2010

Keywords: not listed

DOI: not available

Chinese Library Classification:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

This paper proposes a new approach to unsupervised speaker adaptation inspired by the recent success of the factor analysis-based Total Variability approach to text-independent speaker verification [1]. This approach effectively represents speaker variability in terms of low-dimensional total factor vectors and, when paired with the simplicity of cosine similarity scoring, allows for easy manipulation and efficient computation [2]. The development of our adaptation algorithm is motivated by the desire to have a robust method of setting an adaptation threshold, to minimize the amount of computation required for each adaptation update, and to simplify the associated score normalization procedures where possible. To address the final issue, we propose the Symmetric Normalization (S-norm) method, which takes advantage of the symmetry in cosine similarity scoring and achieves performance competitive with that of ZT-norm while requiring fewer parameter calculations. In subsequent experiments, we also assess an attempt to replace score normalization procedures altogether with a Normalized Cosine Similarity scoring function [3]. We evaluated the performance of our unsupervised speaker adaptation algorithm under various score normalization procedures on the 10sec-10sec and core conditions of the 2008 NIST SRE dataset. Using results without adaptation as our baseline, we found that the proposed methods consistently improve speaker verification performance and achieve state-of-the-art results.
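The abstract names cosine similarity scoring and a symmetric score normalization but gives no formulas. The sketch below illustrates the general idea under stated assumptions: total factor vectors are plain lists of floats, the impostor cohort is a hypothetical list of such vectors, and the symmetric normalization is taken to be the sum of two z-scores, one computed from the target side and one from the test side. The function names `cosine_score` and `s_norm` and this exact form of the normalization are assumptions for illustration, not the authors' implementation.

```python
import math
from statistics import mean, pstdev


def cosine_score(w1, w2):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(w1, w2))
    norm1 = math.sqrt(sum(a * a for a in w1))
    norm2 = math.sqrt(sum(b * b for b in w2))
    return dot / (norm1 * norm2)


def s_norm(w_target, w_test, cohort):
    """Symmetric normalization sketch (assumed form, not from the abstract).

    The raw cosine score is standardized twice: once with the statistics of
    the target vector scored against the impostor cohort, and once with the
    statistics of the test vector scored against the same cohort. The two
    standardized scores are then added.
    """
    raw = cosine_score(w_target, w_test)
    target_scores = [cosine_score(w_target, c) for c in cohort]
    test_scores = [cosine_score(w_test, c) for c in cohort]
    z_target = (raw - mean(target_scores)) / pstdev(target_scores)
    z_test = (raw - mean(test_scores)) / pstdev(test_scores)
    return z_target + z_test


# Toy vectors chosen only to exercise the functions; they carry no meaning.
target = [1.0, 2.0, 3.0]
test = [2.0, 1.0, 0.0]
cohort = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]

forward = s_norm(target, test, cohort)
backward = s_norm(test, target, cohort)
```

Because the cosine score does not change when its two arguments are swapped, `forward` and `backward` come out equal in this sketch. That symmetry is why a single shared cohort can serve both sides, which is consistent with the abstract's remark about needing fewer parameter calculations than ZT-norm.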

Citation

Pages: 76-82

Page count: 7
