Small Footprint Multi-channel Network for Keyword Spotting with Centroid Based Awareness

Cited by: 3
Authors
Ng, Dianwen [1, 2]
Xiao, Yang [2]
Yip, Jia Qi [1, 2]
Yang, Zhao [2]
Tian, Biao [1]
Fu, Qiang [1]
Chng, Eng Siong [2]
Ma, Bin [1]
Affiliations
[1] Alibaba Grp, Speech Lab DAMO Acad, Hangzhou, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Source
INTERSPEECH 2023 | 2023
Keywords
Small Footprint; Keyword Spotting; Multichannel; Noisy Far-field; Centroid Awareness
DOI
10.21437/Interspeech.2023-1210
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Spoken Keyword Spotting (KWS) in noisy far-field environments is challenging for small-footprint models, given the restrictions on computational resources (e.g., model size, running memory). The task becomes even harder when handling noise from multiple microphones. To address this, we present a new multi-channel model that uses a CNN-based network with a linear mixing unit to achieve local-global dependency representations. Our method enhances noise robustness while ensuring more efficient computation. In addition, we propose an end-to-end centroid-based awareness module that provides class similarity awareness at the bottleneck level to correct ambiguous cases during prediction. We conducted experiments using real noisy far-field data from the MISP Challenge 2021 and achieved SOTA results compared to existing small-footprint KWS models. Our best score of 0.126 is highly competitive with that of larger models such as 3D-ResNet (0.122), while our model is much smaller, at 473K compared to 13M.
Pages: 296-300
Page count: 5