SMALL FOOTPRINT TEXT-INDEPENDENT SPEAKER VERIFICATION FOR EMBEDDED SYSTEMS

Cited by: 11

Authors:

Balian, Julien [1]

Tavarone, Raffaele [1]

Poumeyrol, Mathieu [1]

Coucke, Alice [1]

Affiliation:

[1] Sonos Inc, Paris, France

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

speaker verification; neural networks; text independent; small footprint;

DOI:

10.1109/ICASSP39728.2021.9413564

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Deep neural network approaches to speaker verification have proven successful, but the typical computational requirements of state-of-the-art (SOTA) systems make them unsuited for embedded applications. In this work, we present a two-stage model architecture orders of magnitude smaller than common solutions (237.5K learning parameters, 11.5 MFLOPS) that reaches a competitive 3.31% Equal Error Rate (EER) on the well-established VoxCeleb1 verification test set. We demonstrate that our solution can run on small devices typical of IoT systems, such as the Raspberry Pi 3B, with a latency below 200 ms on a 5 s utterance. Additionally, we evaluate our model on the acoustically challenging VOiCES corpus. We report a limited increase in EER of 2.6 percentage points with respect to the best-scoring model of the 2019 VOiCES from a Distance Challenge, against a 25.6-fold reduction in the number of learning parameters.
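
The abstract reports results as Equal Error Rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. This record does not describe the authors' scoring procedure, so the following is only a minimal, generic sketch of how an EER can be estimated from verification trial scores. The function name `equal_error_rate`, the toy scores, and the convention that a higher score means "same speaker" are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER from two lists of trial scores (hypothetical helper).

    target_scores: scores of trials where both utterances share a speaker.
    nontarget_scores: scores of trials with different speakers.
    Assumes a higher score means "more likely the same speaker".
    Returns (eer, threshold) at the threshold where FAR and FRR are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False rejection rate: target trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Toy scores, invented purely for illustration (not data from the paper).
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real trial lists, the thresholds would usually be swept over many more scores, and interpolation between neighbouring operating points can give a finer estimate.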

Pages: 6179-6183

Page count: 5
