EARLY FUSION OF SPARSE CLASSIFICATION AND GMM FOR NOISE ROBUST ASR

Cited by: 0

Authors:

Sun, Yang [1]

Gemmeke, Jort F. [1]

Cranen, Bert [1]

ten Bosch, Louis [1]

Boves, Lou [1]

Affiliations:

[1] Radboud Univ Nijmegen, Ctr Language & Speech Technol, NL-6525 ED Nijmegen, Netherlands

Source:

19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011) | 2011

Keywords:

SPEECH RECOGNITION;

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In previous work we have shown that an ASR system consisting of a dual-input DBN, which simultaneously observes MFCC acoustic features and predicted phone labels generated by an exemplar-based Sparse Classification (SC) system, can achieve better word recognition accuracies in noise than a system observing only one of those input streams. This paper explores two modifications of the SC input to further improve the noise robustness of the dual-input DBN system: 1) integrating more time context and 2) using N-best states. Experiments on AURORA-2 reveal that the first approach significantly improves the recognition results at almost all SNRs, particularly in the noisier conditions, achieving up to 6.1% (absolute) accuracy gain at SNR -5 dB. The second modification shows that there is an optimal N which allows the maximum attainable accuracy to be improved further, by another 11.8% at -5 dB.

Pages: 1495-1499

Page count: 5

26 items in total

[21] Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
Hu, Yuchen
Chen, Chen
Zhu, Qiushi
Chng, Eng Siong
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1145 - 1156
[22] Fusion Feature Extraction Based on Auditory and Energy for Noise-Robust Speech Recognition
Shi, Yanyan
Bai, Jing
Xue, Peiyun
Shi, Dianxi
IEEE ACCESS, 2019, 7 : 81911 - 81922
[23] Noise-Robust Deep Learning Model for Emotion Classification Using Facial Expressions
Oh, Seungjun
Kim, Dong-Keun
IEEE ACCESS, 2024, 12 : 143074 - 143089
[24] Noise robust phonetic classification with linear regularized least squares and second-order features
Rifkin, Ryan
Schutte, Ken
Saad, Michelle
Bouvrie, Jake
Glass, Jim
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007 : 881 - +
[25] Combined Features and Kernel Design for Noise Robust Phoneme Classification Using Support Vector Machines
Yousafzai, Jibran
Sollich, Peter
Cvetkovic, Zoran
Yu, Bin
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (05) : 1396 - 1407
[26] Noise-robust Hidden Markov Models for limited training data for within-species bird phrase classification
Kaewtip, Kantapon
Taylor, Charles
Alwan, Abeer
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016 : 2587 - 2591
