Efficient Embedded Speech Recognition for Very Large Vocabulary Mandarin Car-Navigation Systems

Cited by: 8
Authors
Qian, Yanmin [1]
Liu, Jia [1]
Johnson, Michael T. [2]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Marquette Univ, Dept Elect Engn, Milwaukee, WI 53201 USA
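Funding
National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China
Keywords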
Beam-search; Search network; Speech recognition; Word-level pruning; DEVICES;
DOI
10.1109/TCE.2009.5278018
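Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract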
Automatic speech recognition (ASR) for a very large vocabulary of isolated words is a difficult task on a resource-limited embedded device. This paper presents a novel fast decoding algorithm for a Mandarin speech recognition system that can simultaneously process hundreds of thousands of items while maintaining high recognition accuracy. The proposed algorithm constructs a semi-tree search network based on Mandarin pronunciation rules to avoid duplicate syllable matching and eliminate redundant memory use. Starting from a two-stage fixed-width beam-search baseline system, the algorithm employs a variable beam-width pruning strategy and a frame-synchronous word-level pruning strategy to significantly reduce recognition time. The algorithm targets an in-car navigation system in China and is simulated on a standard PC workstation. The experimental results show that, compared to the baseline system, the proposed method reduces recognition time nearly 6-fold and memory size nearly 2-fold, with less than 1% accuracy degradation on a 200,000-word recognition task.
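The idea of sharing syllable prefixes and pruning hypotheses can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses an ordinary prefix tree rather than the paper's semi-tree network, a fixed beam width rather than the variable beam-width and word-level pruning strategies, and one score table per syllable position rather than per acoustic frame. The names `Node`, `build_prefix_tree`, `count_nodes`, and `beam_search`, the toy lexicon, and the scores are all hypothetical.

```python
from typing import Dict, List, Tuple


class Node:
    """One syllable position in the search network (hypothetical structure)."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.words: List[str] = []  # words whose pronunciation ends here


def build_prefix_tree(lexicon: Dict[str, List[str]]) -> Node:
    """Merge entries that share leading syllables so each prefix is stored once."""
    root = Node()
    for word, syllables in lexicon.items():
        node = root
        for syl in syllables:
            node = node.children.setdefault(syl, Node())
        node.words.append(word)
    return root


def count_nodes(node: Node) -> int:
    """Number of syllable nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def beam_search(
    root: Node, step_scores: List[Dict[str, float]], beam_width: int
) -> List[Tuple[float, str]]:
    """Keep only the `beam_width` best partial paths after each syllable step.

    Assumption: step_scores[i] maps a syllable to a log-score at position i.
    A real decoder would score acoustic frames against HMM states instead.
    """
    hyps: List[Tuple[float, Node]] = [(0.0, root)]
    for scores in step_scores:
        expanded = [
            (score + scores[syl], child)
            for score, node in hyps
            for syl, child in node.children.items()
            if syl in scores
        ]
        expanded.sort(key=lambda h: h[0], reverse=True)
        hyps = expanded[:beam_width]  # fixed-width pruning
    return [(score, word) for score, node in hyps for word in node.words]


# Toy lexicon: two entries share the first syllable "bei".
lexicon = {
    "Beijing": ["bei", "jing"],
    "Beihai": ["bei", "hai"],
    "Shanghai": ["shang", "hai"],
}
tree = build_prefix_tree(lexicon)
flat_size = sum(len(s) for s in lexicon.values())
step_scores = [{"bei": -1.0, "shang": -3.0}, {"jing": -0.5, "hai": -2.0}]
results = beam_search(tree, step_scores, beam_width=2)
```

In this toy case the tree stores 5 syllable nodes where a flat list would store 6, and the shared syllable "bei" is scored once for both entries. With hundreds of thousands of entries the sharing is far larger, which is the effect the abstract attributes to the search network.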
Pages: 1496-1500
Page count: 5