Using a DBN to integrate Sparse Classification and GMM-based ASR

Authors
Sun, Yang [1]
Gemmeke, Jort F. [1]
Cranen, Bert [1]
ten Bosch, Louis [1]
Boves, Lou [1]
Affiliation
[1] Radboud Univ Nijmegen, Ctr Language & Speech Technol, Nijmegen, Netherlands
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010
Keywords
noise robustness; speech recognition; dynamic Bayesian network; sparse classification
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification codes
0808; 0809
Abstract
The performance of an HMM-based speech recognizer using MFCCs as input is known to degrade dramatically in noisy conditions. Recently, an exemplar-based noise robust ASR approach, sparse classification (SC), was introduced. While very successful at lower SNRs, the performance at high SNRs suffered when compared to HMM-based systems. In this work, we propose to use a Dynamic Bayesian Network (DBN) to implement an HMM that uses both MFCCs and phone predictions extracted from the SC system as input. By doing experiments on the AURORA-2 connected digit recognition task, we show that our approach successfully combines the strengths of both systems, resulting in competitive recognition accuracies at both high and low SNRs.
Pages: 2098-2101
Page count: 4