Small Footprint Multi-channel Network for Keyword Spotting with Centroid Based Awareness

Cited by: 3
Authors
Ng, Dianwen [1, 2]
Xiao, Yang [2 ]
Yip, Jia Qi [1, 2]
Yang, Zhao [2 ]
Tian, Biao [1 ]
Fu, Qiang [1 ]
Chng, Eng Siong [2 ]
Ma, Bin [1 ]
Affiliations
[1] Alibaba Grp, Speech Lab DAMO Acad, Hangzhou, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Source
INTERSPEECH 2023 | 2023
Keywords
Small Footprint; Keyword Spotting; Multichannel; Noisy Far-field; Centroid Awareness;
DOI
10.21437/Interspeech.2023-1210
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Spoken keyword spotting (KWS) in noisy far-field environments is challenging for small-footprint models because of tight limits on computational resources (e.g., model size, running memory). The task becomes harder still when noise must be handled across multiple microphones. To address this, we present a new multi-channel model that uses a CNN-based network with a linear mixing unit to achieve local-global dependency representations. Our method enhances noise robustness while keeping computation efficient. In addition, we propose an end-to-end centroid-based awareness module that provides class similarity awareness at the bottleneck level to correct ambiguous cases during prediction. We conducted experiments on real noisy far-field data from the MISP Challenge 2021 and achieved state-of-the-art (SOTA) results compared with existing small-footprint KWS models. Our best score of 0.126 is highly competitive with that of larger models such as 3D-ResNet (0.122), while our model is much smaller: 473K versus 13M.
Pages: 296-300
Number of pages: 5
Related papers
50 in total
  • [31] Reduced Model Size Deep Convolutional Neural Networks for Small-Footprint Keyword Spotting
    Tsai, Tsung Han
    Lin, Xin Hui
    2021 28TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS, AND SYSTEMS (IEEE ICECS 2021), 2021
  • [32] TSDNet: An Efficient Light Footprint Keyword Spotting Deep Network Base on Tempera Segment Normalization
    Chen, Fei
    Xue, Hui
    Fang, Pengfei
    PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 335 - 340
  • [33] DLiGRU-X: Efficient X-Vector-Based Embeddings for Small-Footprint Keyword Spotting System
    Wu, Zong-En
    Chan, Shao-Jung
    Wubet, Yeshanew Ale
    Lian, Kuang-Yow
    IEEE ACCESS, 2025, 13 : 23498 - 23507
  • [34] A Depthwise Separable Convolution Neural Network for Small-footprint Keyword Spotting Using Approximate MAC Unit and Streaming Convolution Reuse
    Lu, Yicheng
    Shan, Weiwei
    Xu, Jiaming
    2019 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS 2019), 2019, : 309 - 312
  • [35] Multi-Resolution Stacked 1D-CNN for Small-Footprint keyword Spotting with Two-Stage Detection
    Tang, Jian
    Xue, Shaofei
    2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 310 - 314
  • [36] Small-Footprint Keyword Spotting for Controlling Smart Home Appliances Using TCN and CRNN Models
    Alapati, Hemalatha
    Paolini, Christopher
    Chinara, Suchismita
    Sarkar, Mahasweta
    INTERNATIONAL JOURNAL OF INTERDISCIPLINARY TELECOMMUNICATIONS AND NETWORKING, 2022, 14 (01)
  • [37] FOCAL LOSS AND DOUBLE-EDGE-TRIGGERED DETECTOR FOR ROBUST SMALL-FOOTPRINT KEYWORD SPOTTING
    Liu, Bin
    Nie, Shuai
    Zhang, Yaping
    Liang, Shan
    Yang, Zhanlei
    Liu, Wenju
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6361 - 6365
  • [38] Depthwise Separable Convolutional ResNet with Squeeze-and-Excitation Blocks for Small-footprint Keyword Spotting
    Xu, Menglong
    Zhang, Xiao-Lei
    INTERSPEECH 2020, 2020, : 2547 - 2551
  • [39] MAX-POOLING LOSS TRAINING OF LONG SHORT-TERM MEMORY NETWORKS FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Sun, Ming
    Raju, Anirudh
    Tucker, George
    Panchapagesan, Sankaran
    Fu, Gengshen
    Mandal, Arindam
    Matsoukas, Spyros
    Strom, Nikko
    Vitaladevuni, Shiv
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 474 - 480
  • [40] A Fast Fuzzy Keyword Spotting Algorithm Based on Syllable Confusion Network
    Shao, Jian
    Zhao, Qingwei
    Zhang, Pengyuan
    Liu, Zhaojie
    Yan, Yonghong
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1665 - 1668