Parameter Estimation Procedures for Deep Multi-Frame MVDR Filtering for Single-Microphone Speech Enhancement

Cited by: 1
Authors
Tammen, Marvin [1]
Doclo, Simon
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, D-26129 Oldenburg, Germany
Keywords
Covariance matrices; Noise measurement; Speech enhancement; Interference; Estimation; Transforms; Filtering algorithms; Matrix structures; multi-frame filtering; MVDR filter; speech enhancement; supervised learning; SUBSPACE APPROACH; NOISE-REDUCTION; SEPARATION; NETWORKS
DOI
10.1109/TASLP.2023.3306715
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Aiming at exploiting temporal correlations across consecutive time frames in the short-time Fourier transform (STFT) domain, multi-frame algorithms for single-microphone speech enhancement have been proposed. Typically, the multi-frame filter coefficients are either estimated directly using deep neural networks or a certain filter structure is imposed, e.g., the multi-frame minimum variance distortionless response (MFMVDR) filter structure. Recently, it was shown that integrating the fully differentiable MFMVDR filter into an end-to-end supervised learning framework employing temporal convolutional networks (TCNs) allows for a high estimation accuracy of the required parameters, i.e., the speech inter-frame correlation vector and the interference covariance matrix. In this paper, we investigate different covariance matrix structures, namely Hermitian positive-definite, Hermitian positive-definite Toeplitz, and rank-1. The main differences between the considered matrix structures lie in the number of parameters that need to be estimated by the TCNs as well as the required linear algebra operations. For example, assuming a rank-1 matrix structure, we show that the MFMVDR filter can be written as a linear combination of the TCN outputs, significantly reducing computational complexity. In addition, we consider a covariance matrix estimation procedure based on recursive smoothing. Experimental results on the deep noise suppression challenge dataset show that the estimation procedure using the Hermitian positive-definite matrix structure yields the best performance, closely followed by the rank-1 matrix structure at a much lower complexity. Furthermore, imposing the MFMVDR filter structure instead of directly estimating the multi-frame filter coefficients slightly but consistently improves the speech enhancement performance.
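To make the quantities named in the abstract concrete, the following is a minimal NumPy sketch for a single time-frequency bin. It is not the authors' implementation. It assumes the commonly used closed form of the MFMVDR filter, w = Phi^{-1} gamma / (gamma^H Phi^{-1} gamma), where Phi is the interference covariance matrix and gamma is the speech inter-frame correlation vector, and it assumes a first-order recursive smoothing update for Phi. The function names, the smoothing factor alpha, the number of frames N, and the random data standing in for interference frames are all hypothetical and chosen for illustration only; in the paper these parameters are estimated with TCNs.

```python
import numpy as np


def mfmvdr_filter(phi, gamma):
    """Assumed MFMVDR closed form: w = Phi^{-1} gamma / (gamma^H Phi^{-1} gamma)."""
    # Solve Phi x = gamma instead of forming the explicit inverse.
    phi_inv_gamma = np.linalg.solve(phi, gamma)
    return phi_inv_gamma / (gamma.conj() @ phi_inv_gamma)


def smooth_covariance(phi_prev, y, alpha=0.9):
    """One assumed recursive-smoothing step: alpha * Phi_prev + (1 - alpha) * y y^H."""
    return alpha * phi_prev + (1.0 - alpha) * np.outer(y, y.conj())


rng = np.random.default_rng(0)
N = 4  # hypothetical number of consecutive STFT frames in the filter

# Build a Hermitian positive-definite covariance estimate from random
# vectors that stand in for multi-frame interference observations.
phi = 1e-3 * np.eye(N, dtype=complex)
for _ in range(50):
    y = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    phi = smooth_covariance(phi, y)

# Hypothetical speech inter-frame correlation vector, scaled so that the
# entry for the current frame equals 1 (an assumption of this sketch).
gamma = rng.standard_normal(N) + 1j * rng.standard_normal(N)
gamma = gamma / gamma[0]

w = mfmvdr_filter(phi, gamma)

# Distortionless constraint: w^H gamma should equal 1 up to rounding.
constraint = np.vdot(w, gamma)
```

With a Hermitian Phi, `constraint` evaluates to 1 up to floating-point error, which is the distortionless property the MVDR design enforces. The linear solve is the step that the rank-1 structure discussed in the abstract is reported to avoid, which is where the complexity reduction comes from.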
Pages: 3237-3248
Page count: 12