Beamforming-based Speech Enhancement based on Optimal Ratio Mask

Cited by: 0

Authors:

Ji, Qiang ^{[1]}

Bao, Changchun ^{[1]}

Cheng, Rui ^{[1]}

Affiliations:

[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China

Source:

CONFERENCE PROCEEDINGS OF 2019 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2019) | 2019

Funding:

National Natural Science Foundation of China;

Keywords:

Speech enhancement; beamforming; time-frequency mask; neural networks; MULTICHANNEL WIENER FILTER;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Speech enhancement in noisy and reverberant environments remains a challenging task. Acoustic beamforming with the minimum variance distortionless response (MVDR) criterion has been shown to be effective in this setting. The crucial issue in MVDR-based speech enhancement is obtaining accurate estimates of the speech and noise spatial covariance matrices (SCMs). Time-frequency masking is a reliable way to estimate the SCMs and can therefore improve the performance of the MVDR beamformer in speech enhancement. In this paper, an optimal ratio mask-based method for MVDR beamforming is proposed. Specifically, convolutional neural networks (CNNs) are used in the proposed method, operating on the magnitude and phase components of the short-time Fourier transform (STFT) of the microphone signals to estimate the optimal ratio masks, and these masks are used to compute the SCMs from which the MVDR beamformer is constructed. Experiments are conducted on simulated data. The results show that the proposed method is more robust than the reference methods under adverse acoustic conditions.
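
The mask-to-SCM-to-beamformer chain described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the CNN mask estimator is not reproduced, the masks are assumed to be given as nonnegative time-frequency weights, and the MVDR weights use the common reference-channel formulation w = (Phi_n^{-1} Phi_s / tr(Phi_n^{-1} Phi_s)) u_ref, which the abstract does not confirm. The function names (`masked_scm`, `mvdr_weights`, `apply_beamformer`), the array layout, and the `eps` regularization are all hypothetical choices made for illustration.

```python
import numpy as np


def masked_scm(stft, mask, eps=1e-8):
    """Mask-weighted spatial covariance matrix per frequency bin.

    stft: complex array, shape (channels, frames, freqs)
    mask: real nonnegative array, shape (frames, freqs)
    returns: complex array, shape (freqs, channels, channels)
    """
    # Sum over frames of mask * x x^H, computed for every frequency bin.
    num = np.einsum("tf,ctf,dtf->fcd", mask, stft, stft.conj())
    # Normalize by the total mask weight in each bin (eps avoids division by zero).
    den = mask.sum(axis=0)[:, None, None] + eps
    return num / den


def mvdr_weights(phi_s, phi_n, ref=0, eps=1e-8):
    """MVDR weights in the reference-channel form (an assumed formulation).

    phi_s, phi_n: speech and noise SCMs, shape (freqs, channels, channels)
    returns: complex weights, shape (freqs, channels)
    """
    n_ch = phi_n.shape[-1]
    # Small diagonal loading keeps the noise SCM invertible.
    phi_n = phi_n + eps * np.eye(n_ch)
    # Solve Phi_n A = Phi_s for every bin instead of forming an explicit inverse.
    a = np.linalg.solve(phi_n, phi_s)
    tr = np.trace(a, axis1=-2, axis2=-1)[:, None, None]
    # Select the column that corresponds to the reference microphone.
    return (a / (tr + eps))[..., ref]


def apply_beamformer(w, stft):
    """Apply y[t, f] = w[f]^H x[t, f]; returns shape (frames, freqs)."""
    return np.einsum("fc,ctf->tf", w.conj(), stft)
```

Under these assumptions, a speech mask `m` would give `phi_s = masked_scm(stft, m)`, and a noise SCM could be obtained from a separately estimated noise mask or, as a further simplification, from `1.0 - m` when `m` lies in [0, 1]. The enhanced STFT is then `apply_beamformer(mvdr_weights(phi_s, phi_n), stft)`.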

Pages: 5

25 references in total

[1] IMAGE METHOD FOR EFFICIENTLY SIMULATING SMALL-ROOM ACOUSTICS [J].