EARLY FUSION OF SPARSE CLASSIFICATION AND GMM FOR NOISE ROBUST ASR

Cited by: 0
Authors
Sun, Yang [1]
Gemmeke, Jort F. [1]
Cranen, Bert [1]
ten Bosch, Louis [1]
Boves, Lou [1]
Affiliations
[1] Radboud Univ Nijmegen, Ctr Language & Speech Technol, NL-6525 ED Nijmegen, Netherlands
Source
19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011) | 2011
Keywords
SPEECH RECOGNITION;
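DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract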
In previous work we have shown that an ASR system consisting of a dual-input DBN which simultaneously observes MFCC acoustic features and predicted phone labels that are generated by an exemplar-based Sparse Classification (SC) system can achieve better word recognition accuracies in noise than a system observing only one of those input streams. This paper explores two modifications of the SC input to further improve the noise robustness of the dual-input DBN system: 1) integrating more time context and 2) using N best states. Experiments on AURORA-2 reveal that the first approach significantly improves the recognition results at almost all SNRs, but particularly in the more noisy conditions, achieving up to 6.1% (absolute) accuracy gain at SNR -5 dB. The second modification shows that there is an optimal N which allows the maximum attainable accuracy to be even further improved with another 11.8% at -5 dB.
Pages: 1495-1499
Number of pages: 5