MULTI-CHANNEL SPEECH ENHANCEMENT USING BEAMFORMING AND NULLFORMING FOR SEVERELY ADVERSE DRONE ENVIRONMENT

Cited: 0
Authors
Kim, Seokhyun [1]
Jeong, Won [2]
Park, Hyung-Min [1,2]
Affiliations
[1] Sogang Univ, Dept Elect Engn, Seoul, South Korea
[2] Sogang Univ, Dept Artificial Intelligence, Seoul, South Korea
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024 | 2024
Keywords
Multi-channel Speech Enhancement; Drone Audition Environments; Neural Beamforming
DOI
10.1109/ICASSPW62465.2024.10626577
CLC Classification
O42 [Acoustics]
Subject Classification
070206; 082403
Abstract
In this paper, we present an end-to-end neural beamforming method for multi-channel speech enhancement in drone environments with severe noise levels. In a flying-drone environment, recordings from microphones attached to the drone suffer from nearby, intense propeller and motor noise, resulting in a very low signal-to-noise ratio (SNR) relative to the target speech. EaBNet is a deep neural network (DNN)-based beamforming model that uses embedding and beamforming modules to address the lack of interpretability in previous end-to-end beamforming models; it exploits spatial information for speech enhancement and seeks further improvement through a PostNet. Building upon EaBNet, we incorporate a module inspired by the structure of the Generalized Sidelobe Canceller (GSC) algorithm to estimate the spatial information of ego-noise through nullforming. Additionally, we propose feeding the spectral features of the nullforming outputs, alongside the beamforming output, into the PostNet to achieve higher performance. Experimental results confirm that both methods improve performance.
Pages: 755-759
Page count: 5
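The abstract's GSC-inspired beamforming/nullforming split can be illustrated with the classical linear GSC decomposition. The sketch below is illustrative only, not the paper's DNN-based module: for one frequency bin, a fixed beamformer passes the target direction while a blocking matrix steers a null toward it, yielding noise-only references (the "nullforming" role). The 4-microphone geometry and steering vector are hypothetical.

```python
# Illustrative narrowband GSC sketch (not the paper's learned module).
import numpy as np

def gsc_weights(steering):
    """Return the fixed beamformer w0 and blocking matrix B for one bin.

    w0 passes the target direction with unit gain (beamforming); the
    columns of B are orthogonal to the steering vector, so B^H x contains
    only noise references (a null is steered toward the target).
    """
    m = len(steering)
    w0 = steering / m                      # delay-and-sum fixed beamformer
    # Full QR of the steering vector: the remaining m-1 orthonormal
    # columns span its orthogonal complement and form the blocking matrix.
    q, _ = np.linalg.qr(steering[:, None], mode="complete")
    B = q[:, 1:]
    return w0, B

# Hypothetical steering vector for one frequency bin of a 4-mic array
d = np.exp(1j * np.pi * np.arange(4) * 0.3)
w0, B = gsc_weights(d)

print(np.allclose(B.conj().T @ d, 0))      # True: B nulls the target
print(np.allclose(np.vdot(w0, d), 1.0))    # True: w0 has unit target gain
```

In a full GSC, an adaptive filter then subtracts the noise estimated from the blocking-matrix outputs from the fixed-beamformer output; the paper instead learns these stages end-to-end within EaBNet.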