Custom Mandarin Keyword Spotting with Extended Long Short-Term Memory

Cited by: 0

Authors:

Cao, Haitao [1]

Liu, Xi [1]

Tan, Zhiguo [1]

Yang, Zhenlun [1]

Qin, Xin [2]

Affiliations:

[1] School of Information Engineering, Guangzhou Panyu Polytechnic, Guangzhou, 511483, China

[2] Institute of Big Data and Internet Innovation, Hunan University of Technology and Business, Changsha, 410205, China

Source:

IAENG International Journal of Computer Science | 2024, Vol. 51, No. 12

Keywords:

Deep neural networks

DOI:

Not available


Abstract:

In real-world scenarios, Deep Neural Network (DNN)-powered Keyword Spotting (KWS) systems are typically engineered as lightweight architectures, optimizing for superior performance and low computational complexity in resource-limited devices. However, such lightweight designs often encounter limitations in generalization, particularly when it comes to customizing keywords. This paper presents a two-stage method to customize a Mandarin KWS system rapidly. First, we propose an embedding model to learn the embedding representations of general Mandarin keywords. Subsequently, we facilitate keyword customization with the generalization capability of embedding models through few-shot transfer learning. To improve performance further, in the embedding model, we introduce two scale blocks to fuse acoustic features and employ an Enhanced Extended Long Short-Term Memory (ExLSTM) as the backbone. Experimental results on both English and Mandarin keyword datasets highlight the advantages of the proposed embedding model. In addition, we conduct keyword customization on a self-recorded dataset containing 10 Mandarin keywords. The impressive average accuracy of 97.45% with merely five target samples demonstrates the effectiveness of our method. © (2024), (International Association of Engineers). All rights reserved.


Pages: 1933-1942
