Spectrum enhancement with sparse coding for robust speech recognition

Cited by: 11
Authors
He, Yongjun [1]
Sun, Guanglu [1]
Han, Jiqing [2]
Affiliations
[1] Harbin Univ Sci & Technol, Harbin 150080, Peoples R China
[2] Harbin Inst Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Sparse coding; Speech denoising; Residual noise; Basis pursuit denoising; JOINT COMPENSATION; REPRESENTATION; NOISE; ADAPTATION; REGRESSION; EQUATIONS; FEATURES; SYSTEMS;
DOI
10.1016/j.dsp.2015.04.014
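Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract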
A recent trend in speech recognition is to introduce sparse coding for noise robustness. Although several methods have been proposed, the performance of sparse coding in speech denoising remains limited. Sparse coding assumes that the representation of speech over the speech dictionary is sparse, while that of the noise is dense. This assumption often does not hold in the speech denoising scenario, because many noises are also sparse over the speech dictionary. In that case, the representation of noisy speech still contains noise components, which degrades performance. To solve this problem, we first analyze the assumption of sparse coding and then propose a novel method to enhance the speech spectrum. The method first identifies the atoms that represent the noise sparsely and then selectively ignores them when reconstructing the speech, which reduces the residual noise. Speech features are then extracted from the enhanced spectrum for speech recognition. Experimental results show that the proposed method can substantially improve the noise robustness of a speech recognition system. (C) 2015 Elsevier Inc. All rights reserved.
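The idea in the abstract (find the dictionary atoms that noise activates, then leave them out of the reconstruction) can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the atom-selection criterion or the solver, so the code below assumes a plain ISTA solver for the basis pursuit denoising objective and a simple "most-used atoms on noise-only frames" rule. The names `ista`, `noise_atoms`, `enhance`, `lam`, and `top_k` are hypothetical.

```python
import numpy as np

def ista(D, y, lam=0.1, n_iter=200):
    """Approximately solve min_x 0.5*||y - D x||^2 + lam*||x||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - D.T @ (D @ x - y) / L      # gradient step on the data term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

def noise_atoms(D, noise_frames, lam=0.1, top_k=1):
    """Assumed rule: flag the top_k atoms most activated by noise-only frames."""
    usage = np.zeros(D.shape[1])
    for frame in noise_frames.T:           # one spectral frame per column
        usage += np.abs(ista(D, frame, lam))
    return np.argsort(usage)[-top_k:]

def enhance(D, y, masked, lam=0.1):
    """Sparse-code a noisy frame, zero the flagged atoms, and reconstruct."""
    x = ista(D, y, lam)
    x[masked] = 0.0                        # ignore atoms that represent noise
    return D @ x

# Toy data: a 4-atom identity dictionary where noise lives on atom 3.
D = np.eye(4)
noise_frames = np.array([[0.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0],
                         [0.5, 1.0, 0.7]])
masked = noise_atoms(D, noise_frames, top_k=1)
noisy = np.array([1.0, 0.0, 0.0, 0.8])
enhanced = enhance(D, noisy, masked)
```

In this toy setup the noise component on atom 3 is removed from the reconstruction, while the speech component on atom 0 survives up to the shrinkage introduced by the l1 penalty. A real system would use a learned speech dictionary and a criterion for noise atoms taken from the paper itself.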
Pages: 59-70
Number of pages: 12