Intra- and Inter-frame Features for Automatic Speech Recognition

Cited by: 13

Authors:

Lee, Sung Joo [1,2]

Kang, Byung Ok [1,2]

Chung, Hoon [1,2]

Lee, Yunkeun [1,2]

Affiliations:

[1] ETRI, SW Content Res Lab, Taejon, South Korea

[2] Univ Sci & Technol, Dept Broadband Network Technol, Taejon, South Korea

Source:

ETRI JOURNAL | 2014 / Vol. 36 / No. 03

Keywords:

Speech recognition; feature extraction; word recognition

DOI:

10.4218/etrij.14.0213.0181

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

In this paper, alternative dynamic features for speech recognition are proposed. The goal of this work is to improve speech recognition accuracy by deriving a representation of distinctive dynamic characteristics from the speech spectrum. The work was inspired by two temporal dynamics of a speech signal: the highly non-stationary nature of speech, and the inter-frame change of the speech spectrum. We use a sub-frame spectrum analyzer to capture very rapid spectral changes within a speech analysis frame. In addition, we attempt to measure spectral fluctuations in a more complex manner than traditional dynamic features such as delta or double-delta do. To evaluate the proposed features, speech recognition tests in smartphone environments were conducted. The experimental results show that feature streams simply combined with the proposed features improve the recognition accuracy of a hidden Markov model-based speech recognizer.
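
For orientation, the two notions the abstract contrasts can be sketched roughly as follows. This is not the authors' method: the abstract does not specify their sub-frame analyzer or the proposed features. The `delta` function below is the conventional regression-based delta (applied twice for double-delta), and `subframe_spectra` is a naive illustration of looking at spectra inside one analysis frame. The function names, the regression width `N`, the sub-frame count, and the Hamming window are all assumptions made for this sketch.

```python
import numpy as np


def delta(features, N=2):
    """Conventional regression-based delta over a (frames, dims) array.

    d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2),
    with the first and last frames replicated at the edges.
    """
    features = np.asarray(features, dtype=float)
    T = features.shape[0]
    padded = np.pad(features, ((N, N), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, N + 1))
    out = np.zeros_like(features)
    for n in range(1, N + 1):
        # padded[N + n + t] is c_{t+n}; padded[N - n + t] is c_{t-n}
        out += n * (padded[N + n:N + n + T] - padded[N - n:N - n + T])
    return out / denom


def subframe_spectra(frame, num_subframes=4):
    """Hypothetical intra-frame view: magnitude spectra of equal sub-frames.

    Splits one analysis frame into non-overlapping sub-frames (an assumption,
    not the paper's analyzer) and returns a (num_subframes, bins) array.
    """
    frame = np.asarray(frame, dtype=float)
    sub_len = len(frame) // num_subframes
    subs = frame[:sub_len * num_subframes].reshape(num_subframes, sub_len)
    window = np.hamming(sub_len)
    return np.abs(np.fft.rfft(subs * window, axis=1))
```

Double-delta in this sketch is simply `delta(delta(features))`; the spread of `subframe_spectra` rows within a frame gives one crude view of intra-frame spectral change.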


Pages: 514-517

Page count: 4

