SMALL FOOTPRINT TEXT-INDEPENDENT SPEAKER VERIFICATION FOR EMBEDDED SYSTEMS

Cited by: 11

Authors:

Balian, Julien [1]

Tavarone, Raffaele [1]

Poumeyrol, Mathieu [1]

Coucke, Alice [1]

Affiliation:

[1] Sonos Inc, Paris, France

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

speaker verification; neural networks; text independent; small footprint

DOI:

10.1109/ICASSP39728.2021.9413564

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Deep neural network approaches to speaker verification have proven successful, but typical computational requirements of state-of-the-art (SOTA) systems make them unsuited for embedded applications. In this work, we present a two-stage model architecture orders of magnitude smaller than common solutions (237.5K learning parameters, 11.5 MFLOPS) reaching a competitive result of 3.31% Equal Error Rate (EER) on the well-established VoxCeleb1 verification test set. We demonstrate the possibility of running our solution on small devices typical of IoT systems such as the Raspberry Pi 3B with a latency smaller than 200 ms on a 5 s long utterance. Additionally, we evaluate our model on the acoustically challenging VOiCES corpus. We report a limited increase in EER of 2.6 percentage points with respect to the best scoring model of the 2019 VOiCES from a Distance Challenge, against a reduction of 25.6 times in the number of learning parameters.

Pages: 6179-6183

Page count: 5
