Speech Enhancement via Low-rank Matrix Decomposition and Image Based Masking

Cited by: 0

Authors:

Liu, Liyang [1]

Ding, Zhaogui [1]

Li, Weifeng [1]

Wang, Longbiao [2]

Liao, Qingmin [1]

Affiliations:

[1] Tsinghua Univ, Dept Elect Engn, Grad Sch Shenzhen, Beijing, Peoples R China

[2] Nagaoka Univ Technol, Nagaoka, Niigata, Japan

Source:

2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014

Keywords:

speech enhancement; spectrogram decomposition; sparse; low-rank; RPCA

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Speech enhancement is an important task in many applications, such as speech recognition. Conventional methods require some principle by which to distinguish speech from noise, and the most successful ones rely on strong models of both. However, if the noise actually encountered differs significantly from the system's assumptions, performance declines rapidly. In this work, we propose an unsupervised speech enhancement system that decomposes the time-frequency spectrogram into a sparse foreground (speech) and a low-rank background (noise), which makes few assumptions about the noise other than its limited spectral variation. An image-based masking step is also designed to address the poor noise removal obtained with spectrogram decomposition alone. Evaluations via PESQ and SegSNR show that the new approach improves signal-to-distortion ratio and PESQ in most cases compared with several traditional speech enhancement algorithms.
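The sparse plus low-rank split named in the abstract can be illustrated with generic robust PCA (principal component pursuit, reference [4]). The block below is a minimal sketch, not the authors' implementation: the abstract does not state which solver, parameters, or spectrogram settings were used, so the inexact augmented-Lagrangian iteration, the zero initialization, the default `lam`, `rho`, and starting `mu`, and the function names `rpca`, `soft_threshold`, and `svd_threshold` are all assumptions made for illustration. The input is assumed to be a non-negative magnitude spectrogram (frequency bins by frames). The image-based masking step is not sketched because the abstract gives no details about it.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: proximal operator of tau * L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Singular-value shrinkage: proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(m, lam=None, tol=1e-7, max_iter=200, rho=1.5):
    """Split a 2-D matrix m into low-rank + sparse parts (illustrative sketch).

    Approximately solves  min ||L||_* + lam * ||S||_1  s.t.  L + S = m
    with an inexact augmented-Lagrangian loop. All defaults are assumptions.
    """
    m = np.asarray(m, dtype=float)
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    norm_m = np.linalg.norm(m)
    if norm_m == 0.0:
        return low, sparse
    if lam is None:
        # Weight suggested in the RPCA literature: 1 / sqrt(max dimension).
        lam = 1.0 / np.sqrt(max(m.shape))
    mu = 1.25 / np.linalg.norm(m, 2)  # assumed starting penalty
    y = np.zeros_like(m)              # Lagrange multiplier
    for _ in range(max_iter):
        low = svd_threshold(m - sparse + y / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + y / mu, lam / mu)
        resid = m - low - sparse
        y = y + mu * resid
        mu *= rho                     # growing penalty tightens L + S = m
        if np.linalg.norm(resid) <= tol * norm_m:
            break
    return low, sparse


if __name__ == "__main__":
    # Synthetic stand-in for a magnitude spectrogram (64 bins x 100 frames):
    # noise with a fixed spectral shape (rank 1) plus a few strong bins.
    rng = np.random.default_rng(0)
    noise = np.outer(rng.random(64), np.ones(100))
    speech = np.zeros((64, 100))
    speech[rng.integers(0, 64, 50), rng.integers(0, 100, 50)] = 5.0
    low, sparse = rpca(noise + speech)
    print("estimated rank of background:", np.linalg.matrix_rank(low, tol=1e-3))
    print("non-zero foreground bins:", np.count_nonzero(np.abs(sparse) > 1e-3))
```

In an enhancement pipeline of this kind, the sparse part would typically be combined with the noisy phase and inverted back to a waveform; that stage is outside this sketch.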

Citation

Pages: 389 / +

Page count: 2

11 references in total

[1] A Speech Enhancement Algorithm Based on a Chi MRF Model of the Speech STFT Amplitudes [J].

Andrianakis, Yiannis ;

White, Paul R. .

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (08) :1508-1517

[2]

[Anonymous], 2013, COMPUT REV

[3]

Boll S., 1979, ACOUSTICS SPEECH SIG, V27, P113, DOI DOI 10.1109/TASSP.1979.1163209

[4] Robust Principal Component Analysis? [J].

Candes, Emmanuel J. ;

Li, Xiaodong ;

Ma, Yi ;

Wright, John .

JOURNAL OF THE ACM, 2011, 58 (03)

[5] Model-Based Speech Enhancement With Improved Spectral Envelope Estimation via Dynamics Tracking [J].

Chen, Ruofei ;

Chan, Cheung-Fat ;

So, Hing Cheung .

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (04) :1324-1336

[6] SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR LOG-SPECTRAL AMPLITUDE ESTIMATOR [J].

EPHRAIM, Y ;

MALAH, D .

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, 33 (02) :443-445

[7] A SIGNAL SUBSPACE APPROACH FOR SPEECH ENHANCEMENT [J].

EPHRAIM, Y ;

VANTREES, HL .

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, 3 (04) :251-266

[8] Speech Enhancement, Gain, and Noise Spectrum Adaptation Using Approximate Bayesian Estimation [J].

Hao, Jiucang ;

Attias, Hagai ;

Nagarajan, Srikantan ;

Lee, Te-Won ;

Sejnowski, Terrence J. .

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (01) :24-37

[9]

Huang PS, 2012, INT CONF ACOUST SPEE, P57, DOI 10.1109/ICASSP.2012.6287816

[10] SPEECH ENHANCEMENT USING A SOFT-DECISION NOISE SUPPRESSION FILTER [J].

MCAULAY, RJ ;

MALPASS, ML .

IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1980, 28 (02) :137-145
