Robust ASR Based on ETSI Advanced Front-End Using Complex Speech Analysis

Cited by: 1

Authors:

Higa, Keita [1]

Funaki, Keiichi [2]

Affiliations:

[1] Univ Ryukyus, Sch Engn & Sci, Nakagami, Okinawa 9030213, Japan

[2] Univ Ryukyus, C&N Ctr, Nakagami, Okinawa 9030213, Japan

Source:

IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES | 2015, Vol. E98-A, No. 11

Keywords:

robust ASR; ETSI AFE; iterative Wiener filter (IWF); complex speech analysis; analytic signal; RECOGNITION; ENHANCEMENT

DOI:

10.1587/transfun.E98.A.2211

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

The advanced front-end (AFE) for automatic speech recognition (ASR) was standardized by the European Telecommunications Standards Institute (ETSI). The AFE provides speech enhancement realized by an iterative Wiener filter (IWF), in which an FFT spectrum smoothed over adjacent frames is used to design the filter. We have previously proposed robust time-varying complex auto-regressive (TV-CAR) speech analysis for an analytic signal and evaluated its performance in speech processing tasks such as F0 estimation and speech enhancement. TV-CAR analysis can estimate a more accurate spectrum than the FFT, especially at low frequencies, because of the nature of the analytic signal. In addition, TV-CAR analysis yields a more accurate speech spectrum in the presence of additive noise. In this paper, a time-invariant version of wide-band TV-CAR analysis is introduced into the IWF of the AFE and is evaluated using the CENSREC-2 database and its baseline script.
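The idea summarized in the abstract (designing a Wiener filter from a complex AR spectrum of the analytic signal instead of a smoothed FFT spectrum) can be illustrated with a minimal sketch. It is not the ETSI AFE algorithm and not the authors' TV-CAR method: it assumes a plain time-invariant least-squares complex AR fit, a single non-iterated gain computation, and a noise spectrum that is already known. All function names (`analytic_signal`, `complex_ar`, `ar_power_spectrum`, `wiener_gain`) and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal of a real sequence: zero the negative-frequency FFT bins."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist once
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def complex_ar(z, order):
    """Least-squares fit of z[t] = -sum_k a[k] z[t-k] + e[t] with complex a[k].

    Returns the coefficients a[1..order] and the mean squared residual.
    """
    n = len(z)
    # Column k holds the signal delayed by k samples.
    past = np.column_stack([z[order - k:n - k] for k in range(1, order + 1)])
    target = z[order:]
    a, *_ = np.linalg.lstsq(past, -target, rcond=None)
    residual = target + past @ a
    return a, np.mean(np.abs(residual) ** 2)


def ar_power_spectrum(a, sigma2, omega):
    """AR power spectrum sigma2 / |1 + sum_k a[k] exp(-j*omega*k)|^2."""
    k = np.arange(1, len(a) + 1)
    denom = 1.0 + np.exp(-1j * np.outer(omega, k)) @ a
    return sigma2 / np.abs(denom) ** 2


def wiener_gain(speech_psd, noise_psd):
    """Wiener gain S / (S + N) per frequency bin."""
    return speech_psd / (speech_psd + noise_psd)


# Illustrative use on a noisy sinusoid (all values are assumptions).
rng = np.random.default_rng(0)
n, noise_std = 512, 0.05
t = np.arange(n)
noisy = np.cos(2 * np.pi * 77 * t / n) + noise_std * rng.standard_normal(n)

z = analytic_signal(noisy)
a, sigma2 = complex_ar(z, order=2)
omega = np.linspace(0.0, np.pi, 256, endpoint=False)
speech_psd = ar_power_spectrum(a, sigma2, omega)
# Flat spectrum of the analytic white noise, assumed known here.
noise_psd = np.full_like(speech_psd, 4 * noise_std ** 2)
gain = wiener_gain(speech_psd, noise_psd)
```

In this sketch the gain approaches 1 near the sinusoid's frequency and drops elsewhere. An iterative scheme would re-estimate the spectrum from the filtered signal and recompute the gain; that loop, the frame processing, and the noise estimation of the actual front-end are omitted.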


Pages: 2211-2219

Page count: 9

