EARLY FUSION OF SPARSE CLASSIFICATION AND GMM FOR NOISE ROBUST ASR

Cited by: 0

Authors:

Sun, Yang [1]

Gemmeke, Jort F. [1]

Cranen, Bert [1]

ten Bosch, Louis [1]

Boves, Lou [1]

Affiliations:

[1] Radboud Univ Nijmegen, Ctr Language & Speech Technol, NL-6525 ED Nijmegen, Netherlands

Source:

19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011) | 2011

Keywords:

SPEECH RECOGNITION;

DOI:

Not available

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In previous work we have shown that an ASR system consisting of a dual-input DBN which simultaneously observes MFCC acoustic features and predicted phone labels that are generated by an exemplar-based Sparse Classification (SC) system can achieve better word recognition accuracies in noise than a system observing only one of those input streams. This paper explores two modifications of the SC input to further improve the noise robustness of the dual-input DBN system: 1) integrating more time context and 2) using N best states. Experiments on AURORA-2 reveal that the first approach significantly improves the recognition results at almost all SNRs, but particularly in the more noisy conditions, achieving up to 6.1% (absolute) accuracy gain at SNR -5 dB. The second modification shows that there is an optimal N which allows the maximum attainable accuracy to be even further improved with another 11.8% at -5 dB.


Pages: 1495-1499

Number of pages: 5