Spectrum enhancement with sparse coding for robust speech recognition

Cited by: 11

Authors:

He, Yongjun [1]

Sun, Guanglu [1]

Han, Jiqing [2]

Affiliations:

[1] Harbin Univ Sci & Technol, Harbin 150080, Peoples R China

[2] Harbin Inst Technol, Harbin 150001, Peoples R China

Source:

DIGITAL SIGNAL PROCESSING | 2015 / Vol. 43

Funding:

National Natural Science Foundation of China;

Keywords:

Sparse coding; Speech denoising; Residual noise; Basis pursuit denoising; JOINT COMPENSATION; REPRESENTATION; NOISE; ADAPTATION; REGRESSION; EQUATIONS; FEATURES; SYSTEMS;

DOI:

10.1016/j.dsp.2015.04.014

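Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes: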

0808 ; 0809 ;

Abstract:

A recent trend in speech recognition is to introduce sparse coding for noise robustness. Although several methods have been proposed, the performance of sparse coding in speech denoising remains limited. Sparse coding assumes that the representation of speech over the speech dictionary is sparse, while that of the noise is dense. This assumption does not hold in the speech denoising scenario, because many noises are also sparse over the speech dictionary. In that case, the representation of noisy speech still contains noise components, which degrades performance. To solve this problem, we first analyze the assumption of sparse coding and then propose a novel method to enhance the speech spectrum. This method first identifies the atoms that represent the noise sparsely and then selectively ignores them when reconstructing the speech, reducing the residual noise. Speech features are then extracted from the enhanced spectrum for speech recognition. Experimental results show that the proposed method can substantially improve the noise robustness of a speech recognition system. (C) 2015 Elsevier Inc. All rights reserved.
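The idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not state how noise atoms are selected, so the energy-share rule below, the function names (`ista`, `noise_atoms`, `enhance`), and all parameter values are assumptions made purely for illustration. ISTA is used here as a generic solver for the basis pursuit denoising problem named in the keywords.

```python
import numpy as np

def ista(D, y, lam=0.1, n_iter=200):
    """Approximately solve min_x 0.5*||y - D x||^2 + lam*||x||_1 with ISTA."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))                         # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft threshold
    return x

def noise_atoms(D, noise_frames, lam=0.1, energy_share=0.1):
    """Flag atoms that carry a large share of the coefficient energy
    when noise-only frames are coded over the speech dictionary D.
    The energy-share rule is an assumed selection criterion."""
    codes = np.array([ista(D, n, lam) for n in noise_frames])
    energy = np.sum(codes ** 2, axis=0)
    total = energy.sum()
    if total == 0.0:
        return np.zeros(D.shape[1], dtype=bool)
    return energy / total > energy_share

def enhance(D, noisy, mask, lam=0.1):
    """Code a noisy frame, drop the flagged atoms, and reconstruct."""
    x = ista(D, noisy, lam)
    x[mask] = 0.0  # ignore atoms that represent the noise sparsely
    return D @ x

# Toy example: an identity "dictionary" so each atom is one spectral bin.
D = np.eye(8)
speech = np.zeros(8); speech[0], speech[3] = 2.0, 1.5
noise = np.zeros(8); noise[5] = 1.0
mask = noise_atoms(D, [noise, 0.8 * noise])
enhanced = enhance(D, speech + noise, mask)
```

In this toy case the noise is itself sparse over the dictionary, so plain sparse coding would keep it; masking the flagged atom removes that residual component from the reconstruction.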


Pages: 59-70

Page count: 12

50 items in total

[21] Speech enhancement with a GSC-like structure employing sparse coding
Li-chun YANG
Yun-tao QIAN
Frontiers of Information Technology & Electronic Engineering, 2014, (12) : 1154 - 1163
[22] Sparse Auditory Reproducing Kernel (SPARK) Features for Noise-Robust Speech Recognition
Fazel, Amin
Chakrabartty, Shantanu
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (04) : 1362 - 1371
[23] A Robust Pansharpening Algorithm Based on Convolutional Sparse Coding for Spatial Enhancement
Gogineni, Rajesh
Chaturvedi, Ashvini
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2019, 12 (10) : 4024 - 4037
[24] Language Recognition via Sparse Coding
Gwon, Youngjune L.
Campbell, William M.
Sturim, Douglas
Kung, H. T.
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2920 - 2924
[25] ROBUST FACE RECOGNITION BASED ON ITERATIVE SPARSE CODING AND PIXEL SELECTION
Lian, Lina
Zheng, Huicheng
Dong, Jiayu
2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 1782 - 1786
[26] On robust face recognition via sparse coding: the good, the bad and the ugly
Wong, Yongkang
Harandi, Mehrtash T.
Sanderson, Conrad
IET BIOMETRICS, 2014, 3 (04) : 176 - 189
[27] Robust Pedestrian Tracking and Recognition from FLIR Video: A Unified Approach via Sparse Coding
Li, Xin
Guo, Rui
Chen, Chao
SENSORS, 2014, 14 (06) : 11245 - 11259
[28] Universal Regularizers for Robust Sparse Coding and Modeling
Ramirez, Ignacio
Sapiro, Guillermo
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2012, 21 (09) : 3850 - 3864
[29] A Novel Speech Emotion Recognition Method via Transfer PCA and Sparse Coding
Song, Peng
Zheng, Wenming
Liu, Jingjing
Li, Jing
Zhang, Xinran
BIOMETRIC RECOGNITION, CCBR 2015, 2015, 9428 : 393 - 400
[30] Sparse coding over redundant dictionaries for fast adaptation of speech recognition system
Shahnawazuddin, S.
Sinha, Rohit
COMPUTER SPEECH AND LANGUAGE, 2017, 43 : 1 - 17
