Factorized MVDR Deep Beamforming for Multi-Channel Speech Enhancement

Cited by: 4

Authors:

Kim, Hansol [1]

Kang, Kyeongmuk [1]

Shin, Jong Won [1]

Affiliations:

[1] Gwangju Inst Sci & Technol, Sch Elect Engn & Comp Sci, Gwangju 61005, South Korea

Source:

IEEE SIGNAL PROCESSING LETTERS | 2022 / Vol. 29

Funding:

National Research Foundation of Singapore;

Keywords:

Speech enhancement; Estimation; Artificial neural networks; MISO communication; Array signal processing; Deep learning; Microphone arrays; Multi-channel speech enhancement; deep learning-based beamforming; factorized MVDR beamformer; NEURAL-NETWORK; SEPARATION; ATTENTION;

DOI:

10.1109/LSP.2022.3200581

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Traditionally, adaptive beamformers such as the minimum-variance distortionless response (MVDR) beamformer and the generalized eigenvalue beamformer have been widely used for multi-channel speech enhancement with a single-channel postfilter. Recently, several approaches have been proposed to enhance the signals used to estimate speech and noise spatial covariance matrices (SCMs) and process the outputs of the beamformers using deep neural networks (DNNs). However, the preprocessing of the signals for SCM estimation may disrupt phase relations among input signals, and the time averages used to estimate speech and noise SCMs may not be optimal for beamformer performance even though the estimated signals are close to the ground truth. In this letter, we propose a deep beamforming approach that estimates factors of the MVDR beamformer using a DNN to circumvent the difficulty of speech and noise SCM estimation. We formulate the MVDR beamformer in a factorized form related to two complex factors and estimate them using a DNN with a cost function comparing the beamformed signal with the original clean speech. Experimental results showed that the proposed factorized MVDR beamformer could mimic the characteristics of the MVDR beamformer with true relative transfer function and noise SCM and outperformed the MVDR beamformer with deep learning-based pre- and post-processing in terms of the perceptual evaluation of speech quality scores.
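
For orientation, the reference system named in the abstract (the MVDR beamformer computed from a relative transfer function and a noise SCM) can be sketched for a single frequency bin as below. This is only the classical textbook form, w = Φ_n^{-1} h / (h^H Φ_n^{-1} h); the abstract does not state the two complex factors of the proposed factorization, so they are not reproduced here. The function names `mvdr_weights` and `demo`, the choice of microphone 0 as the reference, and the random test data are all assumptions made for illustration.

```python
import numpy as np


def mvdr_weights(noise_scm, rtf):
    """Classical MVDR weights for one frequency bin (illustrative sketch).

    noise_scm: (M, M) Hermitian positive-definite noise SCM.
    rtf:       (M,) relative transfer function (assumed rtf[0] == 1).
    """
    # Solve noise_scm @ x = rtf rather than forming an explicit inverse.
    x = np.linalg.solve(noise_scm, rtf)
    # Normalize so that w^H rtf == 1 (the distortionless constraint).
    return x / (rtf.conj() @ x)


def demo(num_mics=4, seed=0):
    """Build random (hypothetical) inputs and return the MVDR weights."""
    rng = np.random.default_rng(seed)
    rtf = rng.standard_normal(num_mics) + 1j * rng.standard_normal(num_mics)
    rtf = rtf / rtf[0]  # reference microphone 0 has unit gain
    # Random noise snapshots give a Hermitian positive-definite SCM.
    snaps = (rng.standard_normal((num_mics, 8 * num_mics))
             + 1j * rng.standard_normal((num_mics, 8 * num_mics)))
    noise_scm = snaps @ snaps.conj().T / snaps.shape[1]
    noise_scm = noise_scm + 1e-3 * np.eye(num_mics)  # diagonal loading
    return mvdr_weights(noise_scm, rtf), rtf, noise_scm


if __name__ == "__main__":
    w, rtf, noise_scm = demo()
    print("w^H h =", np.vdot(w, rtf))
    print("output noise power =", np.real(np.vdot(w, noise_scm @ w)))
```

In practice the weights would be computed per frequency bin and applied to the multi-channel STFT as w^H y; the diagonal loading term is a common numerical safeguard and not something the abstract specifies.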


Pages: 1898-1902

Page count: 5

