Robust ASR Based on ETSI Advanced Front-End Using Complex Speech Analysis

Cited by: 1
Authors
Higa, Keita [1]
Funaki, Keiichi [2]
Affiliations
[1] Univ Ryukyus, Sch Engn & Sci, Nakagami, Okinawa 9030213, Japan
[2] Univ Ryukyus, C&N Ctr, Nakagami, Okinawa 9030213, Japan
Keywords
robust ASR; ETSI AFE; iterative Wiener filter (IWF); complex speech analysis; analytic signal; RECOGNITION; ENHANCEMENT
DOI
10.1587/transfun.E98.A.2211
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The advanced front-end (AFE) for automatic speech recognition (ASR) was standardized by the European Telecommunications Standards Institute (ETSI). The AFE performs speech enhancement with an iterative Wiener filter (IWF) whose design uses an FFT spectrum smoothed over adjacent frames. We have previously proposed robust time-varying complex auto-regressive (TV-CAR) speech analysis for an analytic signal and evaluated its performance in speech processing tasks such as F0 estimation and speech enhancement. TV-CAR analysis can estimate a more accurate spectrum than the FFT, especially at low frequencies, because of the nature of the analytic signal. In addition, its spectral estimates are more robust to additive noise. In this paper, a time-invariant version of wide-band TV-CAR analysis is introduced into the IWF of the AFE and evaluated using the CENSREC-2 database and its baseline script.
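The idea in the abstract, replacing an FFT spectrum with a complex AR spectrum of the analytic signal when designing a Wiener filter, can be illustrated with a minimal sketch. This is not the authors' TV-CAR algorithm or the ETSI AFE procedure: it assumes a plain time-invariant least-squares AR fit, a known noise power spectrum, and a single filter-design step with no iteration. All function names (`analytic_signal`, `complex_ar`, `ar_spectrum`, `wiener_gain`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real sequence, built by zeroing negative frequencies."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist once
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def complex_ar(z, order):
    """Least-squares complex AR fit: z[t] + sum_k a[k] * z[t-k] = e[t].

    Assumption: an ordinary time-invariant fit, not the robust
    time-varying estimator described in the abstract.
    """
    # Each row holds z[t-1], z[t-2], ..., z[t-order].
    past = np.array([z[t - order:t][::-1] for t in range(order, len(z))])
    target = -z[order:]
    a, *_ = np.linalg.lstsq(past, target, rcond=None)
    residual_power = np.mean(np.abs(past @ a - target) ** 2)
    return a, residual_power

def ar_spectrum(a, residual_power, nfft):
    """AR power spectrum: residual_power / |1 + sum_k a[k] e^{-jwk}|^2."""
    denom = np.fft.fft(np.concatenate(([1.0], a)), nfft)
    return residual_power / np.abs(denom) ** 2

def wiener_gain(speech_psd, noise_psd):
    """Wiener filter gain per frequency bin, given speech and noise spectra."""
    return speech_psd / (speech_psd + noise_psd)
```

In this sketch the gain would be computed as `wiener_gain(ar_spectrum(a, p, nfft), noise_psd)`, where `noise_psd` is an assumed, externally supplied noise estimate. Because the analytic signal has no negative-frequency content, the complex AR model spends all of its poles on the positive-frequency band.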
Pages: 2211-2219
Number of pages: 9