MULTI-CHANNEL SPEECH ENHANCEMENT USING BEAMFORMING AND NULLFORMING FOR SEVERELY ADVERSE DRONE ENVIRONMENT

Cited: 0
Authors
Kim, Seokhyun [1]
Jeong, Won [2]
Park, Hyung-Min [1,2]
Affiliations
[1] Sogang Univ, Dept Elect Engn, Seoul, South Korea
[2] Sogang Univ, Dept Artificial Intelligence, Seoul, South Korea
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024 | 2024
Keywords
Multi-channel Speech Enhancement; Drone Audition Environments; Neural Beamforming
DOI
10.1109/ICASSPW62465.2024.10626577
CLC Classification
O42 [Acoustics]
Subject Classification
070206; 082403
Abstract
In this paper, we present an end-to-end neural beamforming method for multi-channel speech enhancement in drone environments with severe noise levels. In a flying-drone environment, recordings from microphones attached to the drone suffer from nearby, intense propeller and motor noise, resulting in a very low signal-to-noise ratio (SNR) relative to the target speech. EaBNet is a deep neural network (DNN)-based beamforming model that uses embedding and beamforming modules to address the lack of interpretability in previous end-to-end beamforming models; it exploits spatial information for speech enhancement and seeks further improvement through a PostNet. Building upon EaBNet, we incorporate a module inspired by the structure of the Generalized Sidelobe Canceller (GSC) algorithm to estimate the spatial information of ego-noise through nullforming. Additionally, we propose feeding the spectral features of the nullforming outputs, alongside the beamforming output, into the PostNet to achieve higher performance. Experimental results confirm that both methods improve performance.
Pages: 755-759
Page count: 5
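The abstract's GSC-inspired beamforming/nullforming split can be illustrated with the classical linear GSC decomposition. The sketch below is illustrative only, not the paper's DNN-based module: for one frequency bin, a fixed beamformer passes the target direction while a blocking matrix steers a null toward it, yielding noise-only references (the "nullforming" role). The 4-microphone geometry and steering vector are hypothetical.

```python
# Illustrative narrowband GSC sketch (not the paper's learned module).
import numpy as np

def gsc_weights(steering):
    """Return the fixed beamformer w0 and blocking matrix B for one bin.

    w0 passes the target direction with unit gain (beamforming); the
    columns of B are orthogonal to the steering vector, so B^H x contains
    only noise references (a null is steered toward the target).
    """
    m = len(steering)
    w0 = steering / m                      # delay-and-sum fixed beamformer
    # Full QR of the steering vector: the remaining m-1 orthonormal
    # columns span its orthogonal complement and form the blocking matrix.
    q, _ = np.linalg.qr(steering[:, None], mode="complete")
    B = q[:, 1:]
    return w0, B

# Hypothetical steering vector for one frequency bin of a 4-mic array
d = np.exp(1j * np.pi * np.arange(4) * 0.3)
w0, B = gsc_weights(d)

print(np.allclose(B.conj().T @ d, 0))      # True: B nulls the target
print(np.allclose(np.vdot(w0, d), 1.0))    # True: w0 has unit target gain
```

In a full GSC, an adaptive filter then subtracts the noise estimated from the blocking-matrix outputs from the fixed-beamformer output; the paper instead learns these stages end-to-end within EaBNet.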