Multi-channel Speech Enhancement Based on the MVDR Beamformer and Postfilter

Cited by: 0
Authors
Wang, Dujuan [1]
Bao, Changchun [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing, Peoples R China
Source
2020 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2020) | 2020
Funding
National Natural Science Foundation of China
Keywords
beamforming; speech enhancement; residual neural network; real and imaginary masks; postfilter;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Deep neural network (DNN) based ideal ratio mask (IRM) estimation methods have yielded good performance in monaural speech enhancement, and they have also shown considerable potential for beamforming and multichannel speech enhancement. For the minimum variance distortionless response (MVDR) beamformer, it is crucial to estimate the covariance matrices of speech and noise accurately, and the accuracy of the time-frequency (T-F) mask has a significant impact on that estimate. This paper therefore proposes an MVDR beamformer for speech enhancement based on a complex real and imaginary ratio mask (CRIRM) estimated with a residual network. First, the real and imaginary masks of speech and noise are estimated by a residual neural network. Next, estimates of the speech and noise are obtained by applying the estimated masks. Finally, the covariance matrices of speech and noise are computed from these estimates and applied in the MVDR beamformer. In addition, to further reduce residual noise, the output of the MVDR beamformer is processed by an end-to-end monaural speech enhancement module. Experiments show that the proposed method improves the quality and intelligibility of the enhanced speech.
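The mask-to-beamformer steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the residual network and the monaural postfilter are omitted, the complex masks are assumed to be given as arrays, and the reference-channel MVDR form w = Φn⁻¹Φs u / tr(Φn⁻¹Φs) is an assumption, since the abstract does not state which MVDR formulation is used. The function names, array layout (channels, frames, frequencies), and diagonal loading constant are all hypothetical.

```python
import numpy as np

def covariance(X):
    """Spatial covariance per frequency bin.

    X: complex STFT, shape (channels, frames, freqs).
    Returns shape (freqs, channels, channels), averaged over frames.
    """
    return np.einsum("ctf,dtf->fcd", X, X.conj()) / X.shape[1]

def mvdr_weights(phi_s, phi_n, ref=0, eps=1e-6):
    """Reference-channel MVDR weights (assumed formulation), shape (freqs, channels)."""
    C = phi_n.shape[-1]
    # Diagonal loading keeps the noise covariance invertible (assumed constant).
    load = eps * np.trace(phi_n, axis1=-2, axis2=-1).real[:, None, None] / C
    phi_n = phi_n + load * np.eye(C)
    num = np.linalg.solve(phi_n, phi_s)        # phi_n^{-1} phi_s for every bin
    tr = np.trace(num, axis1=-2, axis2=-1)     # normalising trace, shape (freqs,)
    return num[..., ref] / (tr[:, None] + 1e-12)

def mask_based_mvdr(Y, mask_s, mask_n, ref=0):
    """Apply complex masks, estimate covariances, and beamform.

    Y: multichannel noisy STFT, shape (channels, frames, freqs).
    mask_s, mask_n: complex T-F masks, shape (frames, freqs),
    standing in for the network outputs.
    """
    S_hat = mask_s[None] * Y                   # speech estimate per channel
    N_hat = mask_n[None] * Y                   # noise estimate per channel
    w = mvdr_weights(covariance(S_hat), covariance(N_hat), ref)
    return np.einsum("fc,ctf->tf", w.conj(), Y)  # w^H y for each T-F bin
```

With a rank-one speech covariance built from a steering vector whose reference entry is 1, these weights pass the target direction with unit gain, which is the distortionless constraint that gives MVDR its name.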
Pages: 5