VERIFYING DEEP KEYWORD SPOTTING DETECTION WITH ACOUSTIC WORD EMBEDDINGS

Citations: 0
Authors
Yuan, Yougen [1 ,2 ]
Lv, Zhiqiang [2 ]
Huang, Shen [2 ]
Xie, Lei [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian, Peoples R China
[2] Tencent Res, Beijing, Peoples R China
Source
2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) | 2019
Funding
National Natural Science Foundation of China;
Keywords
Query-by-example; keyword spotting; acoustic word embeddings; hinge loss; calibration scores;
DOI
10.1109/asru46091.2019.9003781
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To improve keyword spotting (KWS) performance in a live broadcast scenario, we propose a template matching method based on acoustic word embeddings (AWEs) as a second stage that verifies the detections of the Deep KWS system. AWEs are obtained via a deep bidirectional long short-term memory (BLSTM) network trained on limited positive and negative keyword candidates, which encodes variable-length keyword candidates into fixed-dimensional vectors with reasonable discriminative ability. Learning AWEs combines three specifically designed losses: the triplet and reversed triplet losses pull candidates of the same keyword closer and push candidates of different keywords farther apart, while the hinge loss sets a fixed threshold to separate all positive and negative keyword candidates. During keyword verification, calibration scores are used to reduce the bias between different templates for different keyword candidates. Experiments show that adding AWE-based keyword verification to Deep KWS achieves a 5.6% relative accuracy improvement; the hinge loss brings an additional 5.5% relative gain, and the final accuracy climbs to 0.775 with calibration scores.
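The loss terms named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' formulation: the use of cosine distance, the margin and threshold values, and the function names are all hypothetical. The reversed triplet loss is left out because the abstract does not define it.

```python
import numpy as np

def cosine_distance(a, b):
    """Cosine distance between embeddings along the last axis (assumed metric)."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return 1.0 - np.sum(a * b, axis=-1)

def triplet_loss(anchor, positive, negative, margin=0.4):
    """Standard triplet loss: the same-keyword pair should be closer than
    the different-keyword pair by at least `margin` (value is an assumption)."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return np.maximum(0.0, margin + d_pos - d_neg)

def threshold_hinge_loss(anchor, other, is_same, threshold=0.5):
    """Hinge loss around a fixed distance threshold (value is an assumption).
    is_same is +1 for a same-keyword pair and -1 for a different-keyword pair.
    Same-keyword pairs are penalized when their distance exceeds the threshold,
    different-keyword pairs when it falls below the threshold."""
    d = cosine_distance(anchor, other)
    return np.maximum(0.0, is_same * (d - threshold))

# Toy 2-dimensional "embeddings" standing in for BLSTM outputs.
keyword = np.array([1.0, 0.0])
same_keyword = np.array([1.0, 0.0])
other_keyword = np.array([0.0, 1.0])

well_separated = triplet_loss(keyword, same_keyword, other_keyword)
confused = triplet_loss(keyword, other_keyword, same_keyword)
false_accept_penalty = threshold_hinge_loss(keyword, same_keyword, is_same=-1)
```

In this sketch a fixed threshold means that a single distance cutoff can be applied at verification time to every candidate, which is the role the abstract assigns to the hinge loss.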
Pages: 613-620
Page count: 8