Noise-robust speech feature processing with empirical mode decomposition

Cited by: 0

Authors:

Kuo-Hau Wu

Chia-Ping Chen

Bing-Feng Yeh

Affiliation:

[1] National Sun Yat-Sen University, Department of Computer Science and Engineering

Source:

EURASIP Journal on Audio, Speech, and Music Processing, vol. 2011

Keywords:

Speech Signal; Empirical Mode Decomposition; Automatic Speech Recognition; Intrinsic Mode Function; Lower Envelope;

DOI:

Not available

Abstract:

In this article, we propose and investigate a novel technique for processing speech features based on empirical mode decomposition (EMD). EMD generalizes Fourier analysis: it decomposes a signal into a sum of intrinsic mode functions. We implement an iterative algorithm that finds the intrinsic mode functions of any given signal, and we design a speech feature post-processing method based on the extracted intrinsic mode functions to achieve noise robustness in automatic speech recognition. Evaluation on the noisy-digit Aurora 2.0 database shows that the method yields a significant performance improvement: the relative improvement over the baseline features increases from 24.0% to 41.1% when the proposed post-processing is applied to mean-variance normalized speech features. The proposed method also improves on the performance of a very noise-robust frontend when the test speech data are highly mismatched.
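
The abstract mentions an iterative algorithm for finding intrinsic mode functions but gives no details. The sketch below illustrates the generic EMD sifting idea only, and is not the authors' implementation. It rests on several assumptions: envelopes are piecewise-linear (standard EMD typically uses cubic splines), sifting runs for a fixed number of iterations instead of using a convergence criterion, and all names (`local_extrema`, `envelope`, `sift`, `emd`, `n_iter`, `max_imfs`) are hypothetical. The feature post-processing step described in the abstract is not reproduced here.

```python
import math


def local_extrema(x):
    """Return indices of strict interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the knot indices.

    Assumption: the first and last knot values are held flat out to the
    signal boundaries (a simple boundary rule chosen for this sketch).
    """
    n = len(x)
    pts = [(0, x[knots[0]])] + [(k, x[k]) for k in knots] + [(n - 1, x[knots[-1]])]
    env, j = [], 0
    for i in range(n):
        while pts[j + 1][0] < i:
            j += 1
        (i0, v0), (i1, v1) = pts[j], pts[j + 1]
        env.append(v0 + (v1 - v0) * (i - i0) / (i1 - i0))
    return env


def sift(x, n_iter=10):
    """Repeatedly subtract the mean of the upper and lower envelopes."""
    h = list(x)
    for _ in range(n_iter):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=4, n_iter=10):
    """Peel off oscillatory components one at a time; return (imfs, residual)."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left to build envelopes
        imf = sift(residual, n_iter)
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]
    return imfs, residual


# Illustrative input: a slow and a fast sinusoid added together.
signal = [math.sin(2 * math.pi * 0.05 * t) + 0.5 * math.sin(2 * math.pi * 0.25 * t)
          for t in range(200)]
imfs, residual = emd(signal)
```

By construction, the extracted components and the final residual add back up to the input signal, which is the "sum of intrinsic mode functions" property the abstract refers to. In the paper's setting the input would be a speech feature sequence, not the synthetic sinusoids used here.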
