Comparison of AM-FM Based Features For Robust Speech Recognition

Cited by: 0
Authors
Narayana, K. V. S. [1 ]
Sreenivas, T. V. [1 ]
Affiliation
[1] Indian Inst Sci, Dept Elect & Commun Engg, Bangalore 560012, Karnataka, India
Source
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008
Keywords
ASR; AM-FM modeling
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Effective feature extraction for robust speech recognition is a widely addressed topic, and there is currently much effort to use non-stationary signal models in place of the quasi-stationary signal models that underlie standard features such as LPC or MFCC. Joint amplitude modulation and frequency modulation (AM-FM) is a classical non-parametric approach to non-stationary signal modeling, and new feature sets for automatic speech recognition (ASR) have recently been derived from a multi-band AM-FM representation of the signal. We consider several of these representations and compare their performance for robust speech recognition in noise, using the AURORA-2 database. We show that the proposed FEPSTRUM representation is more effective than the others. We also propose an improvement to FEPSTRUM based on the Teager energy operator (TEO) and show that it can outperform even FEPSTRUM in selected cases.
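For readers unfamiliar with the Teager energy operator mentioned in the abstract, the following is a minimal sketch of its standard discrete form, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). It is not the FEPSTRUM improvement described in the paper, whose details the abstract does not give. The function name `teager_energy` and the test-tone parameters are assumptions chosen for illustration. For a pure tone A·cos(Ωn + φ), the operator returns the constant A²·sin²(Ω), which the sketch checks numerically.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, one for each interior sample
    (an empty list if x has fewer than three samples).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test tone: amplitude, digital frequency (rad/sample), and phase
# are illustrative values, not taken from the paper.
A, omega, phi = 0.5, 0.3, 0.1
tone = [A * math.cos(omega * n + phi) for n in range(64)]

psi = teager_energy(tone)

# For a pure tone the operator output is constant: A^2 * sin^2(omega).
expected = (A * math.sin(omega)) ** 2
max_error = max(abs(p - expected) for p in psi)
```

Because the output depends on both amplitude and frequency, the operator tracks a joint AM-FM "energy" of a narrowband signal, which is presumably why it suits multi-band AM-FM features; how the paper applies it is not stated in the abstract.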
Pages: 1545-1548
Page count: 4
Related papers
16 in total
[1]  
COHEN L, TIME FREQUENCY ANAL
[2]   Robust AM-FM features for speech recognition [J].
Dimitriadis, D ;
Maragos, P ;
Potamianos, A .
IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (09) :621-624
[3]  
DIMITRIADIS D, 2003, P SIGN SYST COMP 200, V2, P2078
[4]  
Greenberg S, 1997, INT CONF ACOUST SPEE, P1647, DOI 10.1109/ICASSP.1997.598826
[5]  
Grenier Y., 1983, IEEE T ASSP, V31
[6]  
HANSEN JC, 2003, THESIS U RHODE ISLAN
[7]  
HIRSCH HG, 2000, AURORA EXPT FRAMEWOR
[8]  
Hu YB, 2005, Proceedings of 2005 International Conference on Machine Learning and Cybernetics, Vols 1-9, P3025
[9]   Teager energy based feature parameters for speech recognition in car noise [J].
Jabloun, F ;
Çetin, AE ;
Erzin, E .
IEEE SIGNAL PROCESSING LETTERS, 1999, 6 (10) :259-261
[10]  
KAISER JF, 1990, INT CONF ACOUST SPEE, P381, DOI 10.1109/ICASSP.1990.115702