Small Footprint Multi-channel Network for Keyword Spotting with Centroid Based Awareness

Cited by: 3
Authors
Ng, Dianwen [1, 2]
Xiao, Yang [2]
Yip, Jia Qi [1, 2]
Yang, Zhao [2]
Tian, Biao [1]
Fu, Qiang [1]
Chng, Eng Siong [2]
Ma, Bin [1]
Affiliations
[1] Alibaba Group, Speech Lab, DAMO Academy, Hangzhou, People's Republic of China
[2] Nanyang Technological University, School of Computer Science and Engineering, Singapore, Singapore
Source
INTERSPEECH 2023 | 2023
Keywords
Small Footprint; Keyword Spotting; Multichannel; Noisy Far-field; Centroid Awareness
DOI
10.21437/Interspeech.2023-1210
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Spoken keyword spotting (KWS) in noisy far-field environments is challenging for small-footprint models because of their limited computational resources (e.g., model size, running memory). The task becomes harder still when noise must be handled across multiple microphones. To address this, we present a new multi-channel model that uses a CNN-based network with a linear mixing unit to learn local-global dependency representations. Our method improves noise robustness while keeping computation efficient. In addition, we propose an end-to-end centroid-based awareness module that provides class-similarity awareness at the bottleneck level to correct ambiguous cases during prediction. We conducted experiments on real noisy far-field data from the MISP Challenge 2021 and achieved state-of-the-art (SOTA) results compared with existing small-footprint KWS models. Our best score of 0.126 is highly competitive with that of larger models such as 3D-ResNet (0.122), while our model is much smaller, at 473K compared to 13M.
Pages: 296-300
Number of pages: 5