Noise-robust speech feature processing with empirical mode decomposition

Authors
Kuo-Hau Wu
Chia-Ping Chen
Bing-Feng Yeh
Affiliation
[1] National Sun Yat-Sen University, Department of Computer Science and Engineering
Source
EURASIP Journal on Audio, Speech, and Music Processing, vol. 2011
Keywords
Speech Signal; Empirical Mode Decomposition; Automatic Speech Recognition; Intrinsic Mode Function; Lower Envelope
DOI
Not available
Abstract
This article proposes and investigates a novel technique for processing speech features based on empirical mode decomposition (EMD). EMD generalizes Fourier analysis: it decomposes a signal into a sum of intrinsic mode functions (IMFs). In this study, we implement an iterative algorithm that finds the IMFs of any given signal, and we design a novel speech feature post-processing method based on the extracted IMFs to achieve noise robustness in automatic speech recognition. Evaluation on the noisy-digit Aurora 2.0 database shows that the method yields a significant performance improvement: the relative improvement over the baseline features increases from 24.0% to 41.1% when the proposed post-processing is applied to mean-variance normalized speech features. The proposed method also improves on the performance of a very noise-robust front end when the test speech data are highly mismatched.
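The iterative decomposition mentioned in the abstract can be illustrated with a minimal sketch of the generic "sifting" procedure. This is not the authors' implementation: the function names (`local_extrema`, `envelope`, `sift`, `emd`), the fixed number of sifting iterations, and the piecewise-linear envelopes (standard EMD usually uses cubic splines) are all assumptions chosen to keep the example dependency-free.

```python
import math

def local_extrema(x):
    """Return indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through x at idx (assumption: linear, not
    cubic-spline, interpolation), anchored at both endpoints of x."""
    knots = sorted(set([0] + idx + [len(x) - 1]))
    env = [0.0] * len(x)
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            env[i] = (1 - t) * x[a] + t * x[b]
    return env

def sift(x, n_iter=10):
    """Extract one approximate IMF by repeatedly subtracting the mean of the
    upper and lower envelopes (assumption: fixed iteration count as the
    stopping rule)."""
    h = list(x)
    for _ in range(n_iter):
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h

def emd(x, max_imfs=4):
    """Decompose x into a list of approximate IMFs plus a residue."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 2:
            break  # residue has too few extrema to sift further
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue

# Hypothetical test signal: a fast oscillation riding on a slow one.
signal = [math.sin(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 64)
          for t in range(128)]
imfs, residue = emd(signal)
```

By construction, the extracted IMFs and the final residue sum back to the input signal, which is the "sum of intrinsic mode functions" property the abstract refers to. How the paper selects or modifies IMFs for feature post-processing is not stated in the abstract and is not modeled here.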