Multi-class AUC Optimization for Robust Small-footprint Keyword Spotting with Limited Training Data

Cited by: 0
Authors
Xu, Menglong [1]
Li, Shengqiang [1]
Liang, Chengdong [1]
Zhang, Xiao-Lei [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Marine Sci & Technol, CIAIC, Xian, Peoples R China
Source
INTERSPEECH 2022 | 2022
Funding
National Science Foundation (United States);
Keywords
keyword spotting; multi-class AUC optimization;
DOI
10.21437/Interspeech.2022-11356
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Deep neural networks provide effective solutions to small-footprint keyword spotting (KWS). However, most KWS methods use softmax with minimum cross-entropy as the loss function, which focuses only on maximizing classification accuracy on the training set and does not account for unseen sounds outside the training data. When training data is limited, it remains challenging to achieve robust and highly accurate KWS in real-world scenarios, where unseen sounds are frequently encountered. In this paper, we propose a new KWS method that consists of a novel loss function, based on maximizing the area under the receiver operating characteristic curve (AUC), and a confidence-based decision method. The proposed method not only maintains high keyword classification accuracy but is also robust to unseen sounds. Experimental results on the Google Speech Commands dataset v1 and v2 show that our method achieves state-of-the-art performance in terms of most evaluation metrics.
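The abstract names two ingredients, an AUC-maximizing loss and a confidence-based decision rule, but this record does not give their formulas. The sketch below is therefore only a generic illustration and not the authors' formulation: it assumes a one-vs-rest squared-hinge surrogate over (positive, negative) score pairs and a simple threshold on the top score. The function names, the `margin` and `threshold` parameters, and the `unknown` label are all hypothetical.

```python
import numpy as np


def pairwise_auc(pos_scores, neg_scores):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    diff = pos_scores[:, None] - neg_scores[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def auc_surrogate_loss(scores, labels, margin=1.0):
    """One-vs-rest squared-hinge surrogate for 1 - AUC (an assumed surrogate).

    scores: (n_samples, n_classes) array of class scores.
    labels: (n_samples,) array of integer class labels.
    Classes with no positives or no negatives in the batch are skipped.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    losses = []
    for c in range(scores.shape[1]):
        pos = scores[labels == c, c]
        neg = scores[labels != c, c]
        if pos.size == 0 or neg.size == 0:
            continue
        # Penalize every pair where the positive does not beat the
        # negative by at least `margin`.
        diff = pos[:, None] - neg[None, :]
        losses.append(np.mean(np.maximum(0.0, margin - diff) ** 2))
    return float(np.mean(losses))


def decide(scores, threshold=0.5, unknown=-1):
    """Confidence-based decision: reject as `unknown` if the top score is low."""
    scores = np.asarray(scores, dtype=float)
    top = scores.argmax(axis=1)
    conf = scores.max(axis=1)
    return np.where(conf >= threshold, top, unknown)


if __name__ == "__main__":
    scores = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.4]])
    print(decide(scores).tolist())  # -> [0, 1, -1]
```

In this sketch the third sample is rejected because neither class score reaches the threshold, which is the kind of behavior one would want for sounds outside the training data.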
Pages: 3278-3282
Number of pages: 5
Related papers
37 in total
  • [21] STREAMING SMALL-FOOTPRINT KEYWORD SPOTTING USING SEQUENCE-TO-SEQUENCE MODELS
    He, Yanzhang
    Prabhavalkar, Rohit
    Rao, Kanishka
    Li, Wei
    Bakhtin, Anton
    McGraw, Ian
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 474 - 481
  • [22] Small-footprint Spiking Neural Networks for Power-efficient Keyword Spotting
    Pedroni, Bruno U.
    Sheik, Sadique
    Mostafa, Hesham
    Paul, Somnath
    Augustine, Charles
    Cauwenberghs, Gert
    2018 IEEE BIOMEDICAL CIRCUITS AND SYSTEMS CONFERENCE (BIOCAS): ADVANCED SYSTEMS FOR ENHANCING HUMAN HEALTH, 2018, : 591 - 594
  • [23] Error-Diffusion Based Speech Feature Quantization for Small-Footprint Keyword Spotting
    Luo, Mengjie
    Wang, Dingyi
    Wang, Xiaoqin
    Qiao, Shushan
    Zhou, Yumei
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 1357 - 1361
  • [24] Reduced Model Size Deep Convolutional Neural Networks for Small-Footprint Keyword Spotting
    Tsai, Tsung Han
    Lin, Xin Hui
    2021 28TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS, AND SYSTEMS (IEEE ICECS 2021), 2021,
  • [25] A Configurable Accelerator for Keyword Spotting Based on Small-Footprint Temporal Efficient Neural Network
    He, Keyan
    Chen, Dihu
    Su, Tao
    ELECTRONICS, 2022, 11 (16)
  • [26] MAX-POOLING LOSS TRAINING OF LONG SHORT-TERM MEMORY NETWORKS FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Sun, Ming
    Raju, Anirudh
    Tucker, George
    Panchapagesan, Sankaran
    Fu, Gengshen
    Mandal, Arindam
    Matsoukas, Spyros
    Strom, Nikko
    Vitaladevuni, Shiv
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 474 - 480
  • [27] Small-Footprint Keyword Spotting for Controlling Smart Home Appliances Using TCN and CRNN Models
    Alapati, Hemalatha
    Paolini, Christopher
    Chinara, Suchismita
    Sarkar, Mahasweta
    INTERNATIONAL JOURNAL OF INTERDISCIPLINARY TELECOMMUNICATIONS AND NETWORKING, 2022, 14 (01)
  • [28] Depthwise Separable Convolutional ResNet with Squeeze-and-Excitation Blocks for Small-footprint Keyword Spotting
    Xu, Menglong
    Zhang, Xiao-Lei
    INTERSPEECH 2020, 2020, : 2547 - 2551
  • [29] Small-Footprint Keyword Spotting Based on Gated Channel Transformation Sandglass Residual Neural Network
    Zhang, Ying
    Zhu, Shirong
    Yu, Chao
    Zhao, Lasheng
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (07)
  • [30] An empirical study of cross-lingual transfer learning techniques for small-footprint keyword spotting
    Sun, Ming
    Schwarz, Andreas
    Wu, Minhua
    Strom, Nikko
    Matsoukas, Spyros
    Vitaladevuni, Shiv
    2017 16TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2017, : 255 - 260