HMM-SUPERVISED CLASSIFICATION OF THE NMF COMPONENTS FOR ROBUST SPEECH RECOGNITION

Cited by: 0
Authors
Zheng, Nengheng [1 ]
Li, Xia [1 ]
Cai, Yi [1 ]
Affiliations
[1] Shenzhen Univ, Coll Informat Engn, Shenzhen Key Lab Telecommun & Informat Proc, Shenzhen, Peoples R China
Keywords
Speech separation; speech recognition; nonnegative matrix factorization; supervised classification; speech and music
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a nonnegative matrix factorization (NMF)-based source separation algorithm for robust speech recognition under music interference. NMF decomposes the mixture signal into a set of basis vectors and corresponding gain vectors, each belonging to either speech or music. Source separation is achieved by classifying the NMF components, i.e. the basis vectors and their corresponding gain vectors, into their respective classes. Hidden Markov models (HMMs) are incorporated to supervise the classification. More specifically, the likelihood score output by the Viterbi search, i.e. the probability of the input speech given the recognized word models, is adopted as the classification criterion, so that the separated speech consists of those NMF components that contribute positively to the Viterbi search score. As a result, the recognition output after separation is the most confident one. Automatic speech recognition experiments demonstrate that the proposed source separation algorithm significantly improves the robustness of the recognition system under music interference.
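The decomposition and component-selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a magnitude spectrogram `V` (frequency bins by frames), uses the standard Lee-Seung multiplicative updates for the Euclidean cost, and replaces the Viterbi likelihood with a hypothetical caller-supplied function `score_fn`. The leave-one-out test for a "positive contribution" is also an assumption, since the abstract does not say how contributions are measured.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (freq x frames) as W @ H.

    W holds the basis vectors (columns) and H the gain vectors (rows).
    Uses Lee-Seung multiplicative updates for the Euclidean cost.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def select_components(W, H, score_fn):
    """Return indices of components whose removal lowers score_fn.

    score_fn is a hypothetical stand-in for the recognizer's Viterbi
    likelihood: it maps a reconstructed spectrogram to a scalar score.
    """
    rank = W.shape[1]
    full_score = score_fn(W @ H)
    keep = []
    for k in range(rank):
        mask = np.ones(rank, dtype=bool)
        mask[k] = False
        score_without_k = score_fn(W[:, mask] @ H[mask, :])
        # Assumed criterion: component k contributes positively if the
        # score is higher with it than without it.
        if full_score - score_without_k > 0:
            keep.append(k)
    return keep
```

With a real recognizer, `score_fn` would extract features from the reconstructed spectrogram and return the log-likelihood of the best decoding path. The separated speech would then be `W[:, keep] @ H[keep, :]`.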
Pages: 83-87
Page count: 5