SMALL FOOTPRINT TEXT-INDEPENDENT SPEAKER VERIFICATION FOR EMBEDDED SYSTEMS

Cited by: 11
Authors
Balian, Julien [1]
Tavarone, Raffaele [1]
Poumeyrol, Mathieu [1]
Coucke, Alice [1]
Affiliations
[1] Sonos Inc, Paris, France
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
speaker verification; neural networks; text independent; small footprint
DOI
10.1109/ICASSP39728.2021.9413564
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Deep neural network approaches to speaker verification have proven successful, but typical computational requirements of state-of-the-art (SOTA) systems make them unsuited for embedded applications. In this work, we present a two-stage model architecture orders of magnitude smaller than common solutions (237.5K learning parameters, 11.5 MFLOPS) reaching a competitive result of 3.31% Equal Error Rate (EER) on the well-established VoxCeleb1 verification test set. We demonstrate the possibility of running our solution on small devices typical of IoT systems such as the Raspberry Pi 3B with a latency smaller than 200 ms on a 5 s long utterance. Additionally, we evaluate our model on the acoustically challenging VOiCES corpus. We report a limited increase in EER of 2.6 percentage points with respect to the best scoring model of the 2019 VOiCES from a Distance Challenge, against a reduction of 25.6 times in the number of learning parameters.
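The abstract reports its results as an Equal Error Rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. As a rough illustration of how that metric is generally obtained from verification trial scores, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `equal_error_rate` and the toy scores are hypothetical, and the threshold sweep is a simple approximation that picks the observed score where the two error rates are closest.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    Returns (eer, threshold). Hypothetical helper, for illustration only.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best = None  # (gap between error rates, eer estimate, threshold)
    for t in thresholds:
        # False acceptance rate: impostor trials wrongly accepted.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: target trials wrongly rejected.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (made up): higher means "more likely the same speaker".
targets = [0.9, 0.8, 0.7, 0.4]
impostors = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(targets, impostors)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a real test set such as the VoxCeleb1 trial list, the two score lists would come from scoring each trial pair with the model under evaluation; production toolkits typically interpolate between thresholds rather than picking the nearest observed score.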
Pages: 6179 - 6183
Number of pages: 5
Related papers
50 in total
  • [31] Acoustic Feature Shuffling Network for Text-Independent Speaker Verification
    Li, Jin
    Fang, Xin
    Chu, Fan
    Gao, Tian
    Song, Yan
    Dai, Lirong
    INTERSPEECH 2022, 2022, : 4790 - 4794
  • [32] Robust text-independent speaker verification using genetic programming
    Day, Peter
    Nandi, Asoke K.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (01): : 285 - 295
  • [33] Self-Attention Networks for Text-Independent Speaker Verification
    Bian, Tengyue
    Chen, Fangzhou
    Xu, Li
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 3955 - 3960
  • [34] Deep Neural Network Embeddings for Text-Independent Speaker Verification
    Snyder, David
    Garcia-Romero, Daniel
    Povey, Daniel
    Khudanpur, Sanjeev
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 999 - 1003
  • [35] A Text-Independent Speaker Verification System Based on Cross Entropy
    Lu, Xiaochun
    Yin, Junxun
    COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS, 2009, 51 : 419 - 426
  • [36] Text-Independent Speaker Verification Based on Information Theoretic Learning
    Memon, Sheeraz
    Khanzada, Tariq Jameel Saifullah
    Bhatti, Sania
    MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2011, 30 (03) : 457 - 468
  • [37] Text-independent speaker verification using utterance level scoring and covariance modeling
    Zilca, RD
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2002, 10 (06): : 363 - 370
  • [38] USEFULNESS OF THE LPC-RESIDUE IN TEXT-INDEPENDENT SPEAKER VERIFICATION
    THEVENAZ, P
    HUGLI, H
    SPEECH COMMUNICATION, 1995, 17 (1-2) : 145 - 157
  • [39] Improvement of Text-Independent Speaker Verification Using Gender-like Feature
    Kiawjak, Pornprom
    Wangsiripitak, Somkiat
    Pasupa, Kitsuchart
    2021 13TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SMART TECHNOLOGY (KST-2021), 2021, : 219 - 224
  • [40] Automatic text-independent speaker verification using convolutional deep belief network
    Rakhmanenko, I. A.
    Shelupanov, A. A.
    Kostyuchenko, E. Y.
    COMPUTER OPTICS, 2020, 44 (04) : 596 - +