Subband feature extraction using lapped orthogonal transform for speech recognition

Cited by: 0

Authors:

Tufekci, Z [1]

Gowdy, JN [1]

Affiliations:

[1] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29634 USA

Source:

2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURAL NETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM | 2001

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

It is well known that dividing speech into frequency subbands can improve the performance of a speech recognizer, especially for speech corrupted by noise. Subband (SUB) features are typically extracted by dividing the frequency band into subbands with non-overlapping rectangular windows and then processing each subband's spectrum separately. However, multiplying a signal by a rectangular window creates discontinuities, which produce large-amplitude coefficients at high frequencies that degrade the performance of the speech recognizer. In this paper we propose the Lapped Subband (LAP) features, which are calculated by applying the Discrete Orthogonal Lapped Transform (DOLT) to the mel-scaled log-filterbank energies of a speech frame. The performance of the LAP features was evaluated on a phoneme recognition task and compared with that of SUB features and Mel Frequency Cepstral Coefficient (MFCC) features. Experimental results show that the proposed LAP features outperform SUB and MFCC features under white noise, band-limited white noise, and no-noise conditions.
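The contrast drawn in the abstract between rectangular subbands and lapped subbands can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration, not the paper's method: the exact form of the DOLT is not given in this record, so the modulated lapped transform (an MDCT with a sine window, one well-known orthogonal lapped transform) stands in for it; the per-subband processing for SUB features is assumed to be an orthonormal DCT-II; and the function names, the 24-channel filterbank, the subband size of 6, and the zero-padding at the band edges are all hypothetical choices.

```python
import numpy as np


def dct2_matrix(m):
    """Orthonormal DCT-II matrix of size m x m (rows are basis vectors)."""
    n = np.arange(m)
    k = n[:, None]
    mat = np.sqrt(2.0 / m) * np.cos(np.pi / m * (n + 0.5) * k)
    mat[0] /= np.sqrt(2.0)
    return mat


def sub_features(log_energies, m):
    """SUB-style sketch: cut the vector into non-overlapping blocks of m
    channels (rectangular windows) and apply a DCT-II to each block."""
    blocks = log_energies.reshape(-1, m)
    return blocks @ dct2_matrix(m).T


def mlt_matrix(m):
    """m x 2m modulated lapped transform basis (MDCT with sine window).
    Used here as a stand-in for the DOLT, which this record does not define."""
    n = np.arange(2 * m)
    k = np.arange(m)[:, None]
    window = np.sin(np.pi * (n + 0.5) / (2 * m))
    cosines = np.cos(np.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
    return np.sqrt(2.0 / m) * window * cosines


def lap_features(log_energies, m):
    """LAP-style sketch: overlapping blocks of 2m channels with hop m, each
    tapered by the sine window. Zero-padding the edges is an assumption."""
    padded = np.pad(log_energies, m)
    starts = range(0, len(log_energies) + 1, m)
    blocks = np.stack([padded[s:s + 2 * m] for s in starts])
    return blocks @ mlt_matrix(m).T


def lap_inverse(coeffs, m):
    """Overlap-add synthesis; recovers the input of lap_features."""
    basis = mlt_matrix(m)
    out = np.zeros((coeffs.shape[0] + 1) * m)
    for t, c in enumerate(coeffs):
        out[t * m:(t + 2) * m] += c @ basis
    return out[m:-m]


# Hypothetical frame: 24 log mel-filterbank energies, subbands of 6 channels.
rng = np.random.default_rng(0)
frame = rng.standard_normal(24)
sub = sub_features(frame, 6)   # shape (4, 6): 4 disjoint subbands
lap = lap_features(frame, 6)   # shape (5, 6): 5 overlapping subbands
```

In this sketch both transforms preserve the energy of the frame, and the lapped one is inverted exactly by overlap-add. The difference is that each lapped basis function tapers smoothly across the boundary between neighboring subbands instead of stopping abruptly, which is the property the abstract appeals to.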


Pages: 149 - 152

Page count: 4

50 items in total

[1] TRANSFORM SUBBAND CODING OF SPEECH WITH THE LAPPED ORTHOGONAL TRANSFORM
MALVAR, HS
DUARTE, R
1989 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-3, 1989 : 1268 - 1271
[2] Generalized Unequal Length Lapped Orthogonal Transform for subband image coding
Nagai, T
Ikehara, M
Kaneko, M
Kurematsu, A
2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000 : 520 - 523
[3] Generalized unequal length lapped orthogonal transform for subband image coding
Nagai, T
Ikehara, M
Kaneko, M
Kurematsu, A
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2000, 48 (12) : 3365 - 3378
[4] Speech recognition using the extraction of particular feature by the discrete wavelet transform
Midorikawa, Y. (ymido@cc.oita-u.ac.jp), 2001, IOS Press (13) : 1 - 4
[5] Speech recognition using the extraction of particular feature by the discrete wavelet transform
Midorikawa, Y
Akita, M
INTERNATIONAL JOURNAL OF APPLIED ELECTROMAGNETICS AND MECHANICS, 2001, 13 (1-4) : 13 - 18
[6] VIDEO CODING USING LAPPED ORTHOGONAL TRANSFORM
JOZAWA, H
WATANABE, H
NTT REVIEW, 1993, 5 (02) : 91 - 96
[7] Video coding using lapped orthogonal transform
Jozawa, Hirohisa
Watanabe, Hiroshi
NTT R and D, 1993, 42 (01) : 71 - 78
[8] LAPPED TRANSFORMS FOR EFFICIENT TRANSFORM SUBBAND CODING
MALVAR, HS
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1990, 38 (06) : 969 - 978
[9] Feature extraction for speech recognition based on orthogonal acoustic-feature planes and LDA
Nitta, T
ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999 : 421 - 424
[10] An acoustic echo canceller using lapped orthogonal transform
Wei, CH
Tsai, CP
ISCAS '97 - PROCEEDINGS OF 1997 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS I - IV: CIRCUITS AND SYSTEMS IN THE INFORMATION AGE, 1997 : 2573 - 2576
