Pitch- and Formant-Based Order Adaptation of the Fractional Fourier Transform and Its Application to Speech Recognition

Cited by: 2

Authors:

Yin, Hui [1,2]

Nadeu, Climent [1]

Hohmann, Volker [1,3]

Affiliations:

[1] Univ Politecn Cataluna, TALP Res Ctr, ES-08034 Barcelona, Spain

[2] Beijing Inst Technol, Dept Elect Engn, Beijing 100081, Peoples R China

[3] Carl von Ossietzky Univ Oldenburg, D-26111 Oldenburg, Germany

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2009

Keywords:

FREQUENCY; AMPLITUDE; SIGNALS;

DOI:

10.1155/2009/304579

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

The fractional Fourier transform (FrFT) has been proposed to improve the time-frequency resolution in signal analysis and processing. However, how to select the FrFT transform order for the proper analysis of multicomponent signals like speech is still debated. In this work, we investigate several order adaptation methods. First, FFT- and FrFT-based spectrograms of an artificially generated vowel are compared to demonstrate the methods. Second, an acoustic feature set combining MFCC and FrFT is proposed, and the transform orders for the FrFT are set adaptively according to various methods based on pitch and formants. A tonal vowel discrimination test is designed to compare the performance of these methods using the feature set. The results show that the FrFT-MFCC yields better discriminability of tones and also of vowels, especially with multitransform-order methods. Third, speech recognition experiments were conducted on the clean intervocalic English consonants provided by the Consonant Challenge. Experimental results show that the proposed features with different order adaptation methods can obtain slightly higher recognition rates than the reference MFCC-based recognizer. Copyright (c) 2009 Hui Yin et al.

Pages: 14