Hardware-Software Codesign of Automatic Speech Recognition System for Embedded Real-Time Applications

Cited by: 29

Authors:

Cheng, Octavian ^{[1]}

Abdulla, Waleed ^{[1]}

Salcic, Zoran ^{[1]}

Affiliations:

[1] Univ Auckland, Dept Elect & Comp Engn, Auckland 1142, New Zealand

Source:

IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS | 2011 / Vol. 58 / No. 03

Keywords:

Automatic speech recognition (ASR); embedded system; hardware-software codesign; real-time system; softcore-based system; DESIGN;

DOI:

10.1109/TIE.2009.2022520

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812

Abstract:

We present a hardware-software coprocessing speech recognizer for real-time embedded applications. The system consists of a standard microprocessor and a hardware accelerator for Gaussian mixture model (GMM) emission probability calculation, implemented on a field-programmable gate array. The GMM accelerator is optimized for timing performance by exploiting data parallelism. To avoid a large memory requirement, the accelerator adopts a double-buffering scheme for accessing the acoustic parameters, with no assumption made about the access pattern of these parameters. Experiments on widely used benchmark data show that the real-time factor of the proposed system is 0.62, which is about three times faster than the pure software-based baseline system, while the word accuracy rate is preserved at 93.33%. As part of the recognizer, a new adaptive beam-pruning algorithm is also proposed and implemented, which further reduces the average real-time factor to 0.54 with a word accuracy rate of 93.16%. The proposed speech recognizer is suitable for integration in various types of voice (speech)-controlled applications.

Pages: 850-859

Page count: 10
