A novel speech enhancement method based on constrained low-rank and sparse matrix decomposition

Cited by: 36
Authors
Sun, Chengli [1, 2]
Zhu, Qi [3]
Wan, Minghua [2]
Affiliations
[1] Sci & Technol Avion Integrat Lab, Shanghai 200233, Peoples R China
[2] Nanchang Hangkong Univ, Sch Informat, Nanchang 330063, Peoples R China
[3] Nanjing Univ Aeronaut & Astronaut, Dept Comp Sci & Engn, Nanjing 210016, Jiangsu, Peoples R China
Keywords
Speech enhancement; Matrix decomposition; Low-rank matrix approximation; Robust principal component analysis; SUBSPACE APPROACH; NOISE; ALGORITHM
DOI
10.1016/j.specom.2014.03.002
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we present a novel speech enhancement method based on the principle of constrained low-rank and sparse matrix decomposition (CLSMD). The method treats the noise signal as a low-rank component, because noise spectra in different time frames are usually highly correlated with each other, and treats the speech signal as a sparse component, because it is relatively sparse in the time-frequency domain. Based on these assumptions, we develop an alternative projection algorithm that separates the speech and noise magnitude spectra by imposing rank and sparsity constraints; the enhanced time-domain speech is then reconstructed from the sparse matrix by the inverse discrete Fourier transform and overlap-add synthesis. The proposed method differs significantly from existing speech enhancement methods. It estimates the enhanced speech in a straightforward manner and does not need a voice activity detector to find noise-only excerpts for noise estimation. Moreover, it obtains better performance in low-SNR conditions and does not need to know the exact distribution of the noise signal. Experimental results show that the new method performs better than conventional methods under many types of strong noise, yielding less residual noise and lower speech distortion. (C) 2014 Elsevier B.V. All rights reserved.
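The decomposition idea in the abstract can be illustrated with a minimal sketch. This is not the authors' CLSMD algorithm, whose details are not given in this record; it is a generic alternating-projection scheme that splits a nonnegative magnitude matrix Y into a rank-limited part L (standing in for noise) and a cardinality-limited part S (standing in for speech). The function name `low_rank_sparse_split` and the parameters `rank`, `card`, and `n_iter` are hypothetical, and the STFT analysis and overlap-add synthesis stages are left out.

```python
import numpy as np


def low_rank_sparse_split(Y, rank=1, card=50, n_iter=30):
    """Split Y into L (rank <= rank) plus S (at most `card` nonzeros).

    Hypothetical illustration only: alternates two exact projections,
    a truncated SVD for L and hard thresholding for S.
    """
    Y = np.asarray(Y, dtype=float)
    card = min(card, Y.size)
    S = np.zeros_like(Y)
    L = np.zeros_like(Y)
    for _ in range(n_iter):
        # Rank projection: best rank-`rank` approximation of Y - S.
        U, s, Vt = np.linalg.svd(Y - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparsity projection: keep the `card` largest-magnitude
        # entries of the residual Y - L and zero the rest.
        R = Y - L
        S = np.zeros_like(Y)
        if card > 0:
            idx = np.argpartition(np.abs(R).ravel(), -card)[-card:]
            S.flat[idx] = R.flat[idx]
    return L, S


if __name__ == "__main__":
    # Synthetic stand-in for a magnitude spectrogram (assumed data):
    # rank-1 "noise" plus a handful of large "speech" entries.
    rng = np.random.default_rng(0)
    noise = np.outer(rng.random(64) + 0.5, rng.random(100) + 0.5)
    speech = np.zeros((64, 100))
    rows = rng.integers(0, 64, size=40)
    cols = rng.integers(0, 100, size=40)
    speech[rows, cols] = 5.0 + rng.random(40)
    L, S = low_rank_sparse_split(noise + speech, rank=1, card=40)
    print("rank of L:", np.linalg.matrix_rank(L))
    print("nonzeros in S:", np.count_nonzero(S))
```

Because each step is an exact minimization over one of the two components, the Frobenius norm of the residual Y - L - S cannot increase from one iteration to the next. In an enhancement pipeline of the kind the abstract describes, S would then be passed to an inverse transform and overlap-add synthesis.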
Pages: 44-55
Page count: 12