An Anchor-Free Detector for Continuous Speech Keyword Spotting

Cited by: 0
Authors
Zhao, Zhiyuan [1]
Tang, Chuanxin [1]
Yao, Chengdong [2]
Luo, Chong [1]
Affiliations
[1] Microsoft Research Asia, Beijing, People's Republic of China
[2] University of Technology Sydney, Sydney, NSW, Australia
Source
Keywords
keyword spotting; continuous speech keyword spotting; speech recognition; anchor-free detector; open dataset
DOI
10.21437/Interspeech.2022-296
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Continuous Speech Keyword Spotting (CSKWS) is the task of detecting predefined keywords in continuous speech. In this paper, we regard CSKWS as a one-dimensional object detection task and propose a novel anchor-free detector, named AF-KWS, to solve the problem. AF-KWS directly regresses the center locations and lengths of the keywords through a single-stage deep neural network. In particular, AF-KWS is tailored for this speech task: we introduce an auxiliary unknown class to separate other (non-keyword) words from non-speech or silent background. We have built two benchmark datasets for CSKWS, LibriTop-20 and the continuous meeting analysis keywords (CMAK) dataset. Evaluations on these two datasets show that our proposed AF-KWS outperforms reference schemes by a large margin, and therefore provides a decent baseline for future research.
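The abstract describes the detector only at a high level, so the following is a minimal sketch of how center-and-length outputs of a one-dimensional anchor-free detector might be decoded into keyword segments. It is not the AF-KWS implementation: the function name `decode_keywords`, the per-class center heatmap layout, the local-maximum peak picking, and the `threshold` value are all assumptions made for illustration.

```python
from typing import List, Tuple


def decode_keywords(
    heatmap: List[List[float]],  # heatmap[c][t]: score that class c is centered at frame t (assumed layout)
    lengths: List[float],        # lengths[t]: regressed keyword length in frames at frame t (assumed layout)
    unknown_class: int,          # index of the auxiliary "unknown" class
    threshold: float = 0.5,      # hypothetical score threshold
) -> List[Tuple[int, float, float, float]]:
    """Return (class_id, start_frame, end_frame, score) per detected keyword."""
    detections = []
    for c, scores in enumerate(heatmap):
        if c == unknown_class:
            continue  # non-keyword words are absorbed by the unknown class and never reported
        for t, s in enumerate(scores):
            left = scores[t - 1] if t > 0 else float("-inf")
            right = scores[t + 1] if t + 1 < len(scores) else float("-inf")
            # Keep a frame only if it is a local maximum above the threshold.
            if s >= threshold and s > left and s >= right:
                half = lengths[t] / 2.0
                detections.append((c, t - half, t + half, s))
    # Order detections by start frame.
    return sorted(detections, key=lambda d: d[1])


# Toy input: two keyword classes (0, 1) plus the unknown class (2), 8 frames.
heatmap = [
    [0.1, 0.2, 0.9, 0.3, 0.1, 0.1, 0.1, 0.1],
    [0.0, 0.0, 0.1, 0.1, 0.2, 0.8, 0.2, 0.0],
    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.9],
]
lengths = [0.0, 0.0, 4.0, 0.0, 0.0, 2.0, 0.0, 0.0]
result = decode_keywords(heatmap, lengths, unknown_class=2)
print(result)  # [(0, 0.0, 4.0, 0.9), (1, 4.0, 6.0, 0.8)]
```

In this sketch each detection is read off a single frame: the center comes from the peak position and the extent from the length regressed at that frame, so no predefined anchor windows are needed.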
Pages: 3228-3232
Page count: 5