Single-Channel Speech Enhancement Based on Adaptive Low-Rank Matrix Decomposition

Cited by: 11

Authors:

Li, Chao [1]

Jiang, Ting [1]

Wu, Sheng [1]

Xie, Jianxiao [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Minist Educ, Key Lab Univ Wireless Commun, Beijing 100876, Peoples R China

Source:

IEEE ACCESS | 2020 / Vol. 8

Funding:

National Natural Science Foundation of China; Major Program of the National Natural Science Foundation of China;

Keywords:

Maximum correntropy criterion; single-channel speech enhancement; adaptive low-rank matrix decomposition; energy threshold technique; TRUNCATED NUCLEAR NORM; SUBSPACE APPROACH; SPARSE; MINIMIZATION; SIGNAL; NOISE; COMPLETION; POSTFILTER; SEPARATION; SPECTRUM;

DOI:

10.1109/ACCESS.2020.2975069

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

The low-rank matrix decomposition (LMD) algorithm based on the maximum correntropy criterion (MCC) has recently outperformed other algorithms in classification tasks (e.g., face recognition). We extend it to single-channel speech enhancement by exploiting the low-rank structure of speech signals in the time domain. However, a new issue arises: some residual noise remains in the enhanced speech because the algorithm is sensitive to the exact rank value. To address this issue, we propose a novel adaptive LMD (ALMD) algorithm in which the energy threshold technique is adopted to adaptively update the effective rank value of each frame of the speech matrix. The proposed ALMD algorithm achieves acceptable performance at low signal-to-noise ratio (SNR) levels without approximating the speech phase with the noisy phase. We compare the ALMD algorithm with common conventional algorithms under Gaussian white noise and non-Gaussian noise conditions. The simulation results show that the ALMD algorithm outperforms the tested baseline algorithms in terms of segmental SNR (segSNR), perceptual evaluation of speech quality (PESQ), and short-time objective intelligibility (STOI).
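
The abstract does not spell out how the energy threshold selects the effective rank. As a rough illustration only, the sketch below assumes one common reading: keep the smallest number of singular values whose cumulative squared magnitude reaches a chosen fraction of the total energy. The function names, the `energy_fraction` parameter, and its default value are hypothetical and are not taken from the paper, and the MCC-based decomposition itself is not shown.

```python
import numpy as np


def effective_rank(frame_matrix, energy_fraction=0.9):
    """Smallest rank whose leading singular values hold `energy_fraction`
    of the total squared singular-value energy (assumed threshold rule)."""
    s = np.linalg.svd(frame_matrix, compute_uv=False)
    energy = s ** 2
    total = energy.sum()
    if total == 0.0:
        return 0  # an all-zero frame carries no energy
    cumulative = np.cumsum(energy) / total
    # First index where the cumulative share reaches the threshold.
    rank = int(np.searchsorted(cumulative, energy_fraction)) + 1
    return min(rank, len(s))  # guard against floating-point round-off


def truncate_to_rank(frame_matrix, rank):
    """Plain truncated-SVD approximation, used here only as a stand-in
    for the low-rank step; it is not the paper's MCC-based solver."""
    u, s, vt = np.linalg.svd(frame_matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]
```

In this sketch, `effective_rank` would be called once per frame matrix and its result passed to the low-rank step, so that frames with energy spread over more components keep a larger rank.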

Pages: 37066-37076

Page count: 11

47 references in total

[21] Li, Xiaofei; Girin, Laurent; Gannot, Sharon; Horaud, Radu. Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (03): 645-659

[22] Li, Xuliang; Fan, Miao; Liu, Liyang; Li, Weifeng. Distributed-microphones based in-vehicle speech enhancement via sparse and low-rank spectrogram decomposition [J]. SPEECH COMMUNICATION, 2018, 98: 51-62

[23] Liu, Weifeng; Pokharel, Puskal P.; Principe, Jose C. Correntropy: properties and applications in non-gaussian signal processing [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2007, 55 (11): 5286-5298

[24] Loizou P. C., 2013, COMPUT REV, V2nd

[25] Low S. Y., 2017, P INTER NOISE NOISE, P6186

[26] Low, Siow Yong; Duc Son Pham; Venkatesh, Svetha. Compressive speech enhancement [J]. SPEECH COMMUNICATION, 2013, 55 (06): 757-768

[27] Lu, Ching-Ta. Enhancement of single channel speech using perceptual-decision-directed approach [J]. SPEECH COMMUNICATION, 2011, 53 (04): 495-507

[28] Lu, Yang; Loizou, Philipos C. Estimators of the Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (05): 1123-1137

[29] Markovich-Golan, Shmulik; Gannot, Sharon; Kellermann, Walter. Combined LCMV-TRINICON Beamforming for Separating Multiple Speech Sources in Noisy and Reverberant Environments [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (02): 320-332

[30] Mavaddaty, Samira; Ahadi, Seyed Mohammad; Seyedin, Sanaz. A novel speech enhancement method by learnable sparse and low-rank decomposition and domain adaptation [J]. SPEECH COMMUNICATION, 2016, 76: 42-60
