Frequency Selection Based Separation of Speech Signals with Reduced Computational Time Using Sparse NMF

Cited by: 7

Authors:

Varshney, Yash Vardhan [1]

Abbasi, Zia Ahmad [1]

Abidi, Musiur Raza [1]

Farooq, Omar [1]

Affiliations:

[1] Aligarh Muslim Univ, Dept Elect, Aligarh, Uttar Pradesh, India

Source:

ARCHIVES OF ACOUSTICS | 2017 / Vol. 42 / No. 02

Keywords:

sparse NMF; mixed speech recognition; machine learning; non-negative matrix factorization

DOI:

10.1515/aoa-2017-0031

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

This paper describes the use of wavelet decomposition to speed up the separation of mixed speech signals by non-negative matrix factorisation (NMF). The basis vectors of the individual speakers are assumed to be available from previously recorded training data. The magnitude spectrogram of the mixed signal is factorised using NMF, taking the sparseness of speech signals into account. The high-frequency components of a signal contain only a very small part of its energy. Rejecting these components reduces the size of the input signal, which in turn reduces the computational time of the matrix factorisation. The lower-energy signal is separated out using wavelet decomposition. The work covers wideband microphone speech signals and standard audio signals from digital video equipment. The proposed model separates the signals better than an existing model, as measured by the correlation between the separated and original signals. The signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR) obtained are also higher than those of the existing model. The proposed model also reduces computational time, resulting in faster operation.
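The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not state the wavelet, the NMF cost function, or the reconstruction step, so the one-level Haar wavelet, the Euclidean cost with an L1 sparsity penalty, the soft-mask reconstruction, and all function and parameter names below are assumptions chosen for illustration.

```python
import numpy as np

def haar_lowpass(x):
    """One-level Haar wavelet decomposition (assumed wavelet).
    Returns only the approximation (low-frequency) coefficients,
    which halves the number of samples passed on to NMF."""
    x = np.asarray(x, dtype=float)
    x = x[: len(x) // 2 * 2]          # drop a trailing odd sample
    return (x[0::2] + x[1::2]) / np.sqrt(2.0)

def magnitude_spectrogram(x, frame=256, hop=128):
    """Magnitude of a Hann-windowed short-time Fourier transform.
    Rows are frequency bins, columns are time frames."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def sparse_nmf_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-9):
    """Find H >= 0 with V ~= W @ H while W stays fixed.
    Multiplicative updates for 0.5 * ||V - W H||^2 + sparsity * sum(H)
    (an assumed cost; the L1 term encourages sparse activations)."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
    return H

def separate(V, W1, W2, **kwargs):
    """Split the mixture spectrogram V using pre-trained speaker bases
    W1 and W2. Returns two magnitude estimates built with a soft mask
    (an assumed reconstruction step)."""
    W = np.hstack([W1, W2])
    H = sparse_nmf_activations(V, W, **kwargs)
    k = W1.shape[1]
    V1 = W1 @ H[:k]                   # part explained by speaker 1
    V2 = W2 @ H[k:]                   # part explained by speaker 2
    total = V1 + V2 + 1e-9
    return V * V1 / total, V * V2 / total
```

In this sketch a mixed waveform would first pass through `haar_lowpass`, then `magnitude_spectrogram`, and the result would go to `separate` together with bases trained on signals processed the same way. Halving the number of samples halves the number of spectrogram frames for a fixed frame length, which is where the reduction in factorisation time would come from under these assumptions.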


Pages: 287-295

Page count: 9
