Single-Channel Speech Enhancement Based on Adaptive Low-Rank Matrix Decomposition

Cited by: 11

Authors:

Li, Chao [1]

Jiang, Ting [1]

Wu, Sheng [1]

Xie, Jianxiao [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Minist Educ, Key Lab Univ Wireless Commun, Beijing 100876, Peoples R China

Source:

IEEE ACCESS | 2020 / Vol. 8

Funding:

National Natural Science Foundation of China; Major Program of the National Natural Science Foundation of China

Keywords:

Maximum correntropy criterion; single-channel speech enhancement; adaptive low-rank matrix decomposition; energy threshold technique; TRUNCATED NUCLEAR NORM; SUBSPACE APPROACH; SPARSE; MINIMIZATION; SIGNAL; NOISE; COMPLETION; POSTFILTER; SEPARATION; SPECTRUM;

DOI:

10.1109/ACCESS.2020.2975069

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The low-rank matrix decomposition (LMD) algorithm based on the maximum correntropy criterion (MCC) has recently outperformed other algorithms in classification tasks (e.g., face recognition). We extend it to single-channel speech enhancement by exploiting the low-rank structure of speech signals in the time domain. However, a new issue arises: because the algorithm is sensitive to the exact rank value, some residual noise remains in the enhanced speech. To address this issue, we propose a novel adaptive LMD (ALMD) algorithm in which an energy threshold technique adaptively updates the effective rank value of each frame of the speech matrix. The proposed ALMD algorithm achieves acceptable performance at low signal-to-noise ratio (SNR) levels without approximating the speech phase with the noisy phase. We compare the ALMD algorithm with common conventional algorithms under Gaussian white noise and non-Gaussian noise conditions. The simulation results demonstrate that the ALMD algorithm outperforms the tested baseline algorithms in terms of segmental SNR (segSNR), perceptual evaluation of speech quality (PESQ), and short-time objective intelligibility (STOI).
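The abstract does not spell out how the energy threshold selects a rank, so the following is only a generic sketch of one common reading: for each frame, keep the smallest number of singular values whose cumulative energy (sum of squares) reaches a chosen fraction of the total. The function name `effective_rank`, the parameter `energy_ratio`, and the example singular values are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import accumulate


def effective_rank(singular_values, energy_ratio=0.9):
    """Smallest k whose leading k singular values hold `energy_ratio` of the energy.

    Hypothetical helper: `singular_values` is assumed to be sorted in
    descending order, as returned by a standard SVD routine.
    """
    if not 0.0 < energy_ratio <= 1.0:
        raise ValueError("energy_ratio must lie in (0, 1]")
    energies = [s * s for s in singular_values]
    total = sum(energies)
    if total == 0.0:
        return 0  # an all-zero frame carries no signal
    target = energy_ratio * total
    for k, cumulative in enumerate(accumulate(energies), start=1):
        if cumulative >= target:
            return k
    return len(energies)  # guard against floating-point round-off


# Illustrative singular values for two frames (made-up numbers).
frames = [
    [3.0, 2.0, 1.0, 0.1],    # energy spread over several components
    [5.0, 0.2, 0.1, 0.05],   # energy concentrated in one component
]
ranks = [effective_rank(sv, energy_ratio=0.9) for sv in frames]
print(ranks)
```

Under this assumption the rank varies from frame to frame with how concentrated the energy is, which is the kind of per-frame adaptivity the abstract describes; the actual threshold rule in the paper may differ.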

Citation

Pages: 37066-37076

Page count: 11

