VERIFYING DEEP KEYWORD SPOTTING DETECTION WITH ACOUSTIC WORD EMBEDDINGS

Authors
Yuan, Yougen [1 ,2 ]
Lv, Zhiqiang [2 ]
Huang, Shen [2 ]
Xie, Lei [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian, Peoples R China
[2] Tencent Res, Beijing, Peoples R China
Source
2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) | 2019
Funding
National Natural Science Foundation of China;
Keywords
Query-by-example; keyword spotting; acoustic word embeddings; hinge loss; calibration scores;
DOI
10.1109/asru46091.2019.9003781
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To improve keyword spotting (KWS) performance in a live broadcast scenario, we propose a template matching method based on acoustic word embeddings (AWEs) as a second stage that verifies the detections of a Deep KWS system. The AWEs are obtained from a deep bidirectional long short-term memory (BLSTM) network trained on a limited set of positive and negative keyword candidates; the network encodes variable-length keyword candidates into fixed-dimensional vectors with reasonable discriminative ability. The AWEs are learned with a combination of three specifically designed losses: the triplet and reversed triplet losses pull candidates of the same keyword closer together and push candidates of different keywords farther apart, while the hinge loss sets a fixed threshold that separates all positive keyword candidates from all negative ones. During keyword verification, calibration scores are used to reduce the bias between different templates for different keyword candidates. Experiments show that adding AWE-based keyword verification to Deep KWS yields a 5.6% relative accuracy improvement, the hinge loss brings an additional 5.5% relative gain, and calibration scores raise the final accuracy to 0.775.
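The abstract does not give the loss formulas, so the following is only a minimal sketch of how a triplet term and a fixed-threshold hinge term might be combined. Everything here is an assumption for illustration: the cosine distance, the `margin`, `threshold`, and `hinge_weight` values, and the function names are hypothetical and are not taken from the paper. The reversed triplet loss is left out because the abstract does not define it.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two fixed-dimensional embeddings (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.4):
    """Standard triplet loss: the same-keyword pair should be closer than the
    different-keyword pair by at least `margin` (margin value is an assumption)."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)


def hinge_threshold_loss(distance, is_positive, threshold=0.5, margin=0.1):
    """Hinge term around one fixed threshold (assumed form): positive candidates
    are pushed below `threshold - margin`, negatives above `threshold + margin`."""
    if is_positive:
        return max(0.0, distance - threshold + margin)
    return max(0.0, threshold - distance + margin)


def combined_loss(anchor, positive, negative, hinge_weight=1.0):
    """Triplet term plus weighted hinge terms for one (anchor, positive, negative)
    triple. The weighting scheme is hypothetical."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    hinge = (hinge_threshold_loss(d_pos, True)
             + hinge_threshold_loss(d_neg, False))
    return triplet_loss(anchor, positive, negative) + hinge_weight * hinge


# Toy 2-D embeddings: the template and the positive candidate coincide,
# the negative candidate is orthogonal to both.
template = [1.0, 0.0]
same_keyword = [1.0, 0.0]
other_keyword = [0.0, 1.0]
loss_value = combined_loss(template, same_keyword, other_keyword)
```

In this sketch the triplet term only constrains relative distances, whereas the hinge term ties every distance to one shared threshold, which is the property a second-stage verifier needs when it accepts or rejects a single candidate.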
Pages: 613-620
Page count: 8