SMALL FOOTPRINT TEXT-INDEPENDENT SPEAKER VERIFICATION FOR EMBEDDED SYSTEMS

Cited by: 11

Authors:

Balian, Julien [1]

Tavarone, Raffaele [1]

Poumeyrol, Mathieu [1]

Coucke, Alice [1]

Affiliations:

[1] Sonos Inc, Paris, France

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

speaker verification; neural networks; text independent; small footprint

DOI:

10.1109/ICASSP39728.2021.9413564

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

Deep neural network approaches to speaker verification have proven successful, but typical computational requirements of state-of-the-art (SOTA) systems make them unsuited for embedded applications. In this work, we present a two-stage model architecture orders of magnitude smaller than common solutions (237.5K learning parameters, 11.5 MFLOPS) reaching a competitive result of 3.31% Equal Error Rate (EER) on the well-established VoxCeleb1 verification test set. We demonstrate the possibility of running our solution on small devices typical of IoT systems such as the Raspberry Pi 3B with a latency smaller than 200 ms on a 5 s long utterance. Additionally, we evaluate our model on the acoustically challenging VOiCES corpus. We report a limited increase in EER of 2.6 percentage points with respect to the best scoring model of the 2019 VOiCES from a Distance Challenge, against a reduction of 25.6 times in the number of learning parameters.
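
The abstract reports results as Equal Error Rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER can be approximated from verification trial scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the threshold sweep over observed scores, and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by using every observed score as a threshold.

    Hypothetical helper for illustration; higher scores are assumed to
    mean "more likely the same speaker".
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a genuine (target) trial scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an impostor (non-target) trial scoring at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up): one genuine trial scores low, one impostor scores high.
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
print(f"EER = {equal_error_rate(targets, nontargets):.2%}")  # EER = 25.00%
```

On real trial lists the two error rates rarely cross exactly at an observed score, so evaluation toolkits usually interpolate between neighbouring thresholds. This sketch simply takes the closest crossing.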


Pages: 6179-6183

Page count: 5

