Custom Mandarin Keyword Spotting with Extended Long Short-Term Memory

Cited by: 0
Authors
Cao, Haitao [1]
Liu, Xi [1]
Tan, Zhiguo [1]
Yang, Zhenlun [1]
Qin, Xin [2]
Affiliations
[1] School of Information Engineering, Guangzhou Panyu Polytechnic, Guangzhou 511483, China
[2] Institute of Big Data and Internet Innovation, Hunan University of Technology and Business, Changsha 410205, China
Keywords
Deep neural networks
DOI
Not available
Abstract
In real-world scenarios, Deep Neural Network (DNN)-powered Keyword Spotting (KWS) systems are typically engineered as lightweight architectures, optimizing for superior performance and low computational complexity on resource-limited devices. However, such lightweight designs often generalize poorly, particularly when keywords must be customized. This paper presents a two-stage method for rapidly customizing a Mandarin KWS system. First, we propose an embedding model that learns embedding representations of general Mandarin keywords. We then exploit the generalization capability of the embedding model to customize keywords through few-shot transfer learning. To further improve performance, the embedding model uses two scale blocks to fuse acoustic features and employs an Enhanced Extended Long Short-Term Memory (ExLSTM) as the backbone. Experimental results on both English and Mandarin keyword datasets highlight the advantages of the proposed embedding model. In addition, we conduct keyword customization on a self-recorded dataset containing 10 Mandarin keywords. An average accuracy of 97.45% with only five target samples demonstrates the effectiveness of our method. © 2024, International Association of Engineers. All rights reserved.
Pages: 1933-1942