SINGLE AND MULTI-CHANNEL APPROACHES FOR DISTANT SPEECH RECOGNITION UNDER NOISY REVERBERANT CONDITIONS: I2R'S SYSTEM DESCRIPTION FOR THE ASpIRE CHALLENGE

Times cited: 0

Authors:

Dennis, Jonathan [1]

Tran Huy Dat [1]

Affiliations:

[1] A*STAR, Institute for Infocomm Research, 1 Fusionopolis Way, Singapore 138632, Singapore

Source:

2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU) | 2015

Keywords:

ASpIRE Challenge; mismatched conditions; reverberation; distant speech recognition; beamforming

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we introduce the system developed at the Institute for Infocomm Research (I2R) for the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge. The front end consists of a distributed beamforming algorithm that performs adaptive weighting and channel elimination, a speech dereverberation approach using a maximum-kurtosis criterion, and a robust voice activity detection (VAD) module based on the sub-harmonic ratio (SHR). The acoustic back end is a multi-conditional Deep Neural Network (DNN) model that uses speaker-adapted features, combined with a decoding strategy that performs semi-supervised DNN model adaptation using weighted labels generated from the first-pass decoding output. On the single-microphone evaluation, our system achieved a word error rate (WER) of 44.8%. With beamforming incorporated on the multi-microphone evaluation, the WER improved by over 6% absolute, giving the best evaluation result of 38.5%.
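To make the reported figures concrete, the following is a minimal sketch of how a word error rate is conventionally computed (word-level edit distance divided by the number of reference words), followed by the arithmetic on the two WER values quoted in the abstract. The function name `word_error_rate` and the example sentences are illustrative assumptions; this is not the scoring tool used in the ASpIRE evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words (standard Levenshtein recursion).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, for illustration only.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# WER values quoted in the abstract (percent).
single_mic_wer = 44.8
multi_mic_wer = 38.5
absolute_gain = single_mic_wer - multi_mic_wer          # percentage points
relative_gain = 100 * absolute_gain / single_mic_wer    # percent of baseline
```

Under these definitions, the abstract's "over 6%" corresponds to an absolute reduction of 6.3 percentage points, which is roughly a 14% relative reduction from the single-microphone result.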

Citation

Pages: 518-524

Page count: 7
