MULTI-CHANNEL SPEECH ENHANCEMENT USING BEAMFORMING AND NULLFORMING FOR SEVERELY ADVERSE DRONE ENVIRONMENT

Cited by: 0

Authors:

Kim, Seokhyun [1]

Jeong, Won [2]

Park, Hyung-Min [1,2]

Affiliations:

[1] Sogang Univ, Dept Elect Engn, Seoul, South Korea

[2] Sogang Univ, Dept Artificial Intelligence, Seoul, South Korea

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024 | 2024

Keywords:

Multi-channel Speech Enhancement; Drone Audition Environments; Neural Beamforming;

DOI:

10.1109/ICASSPW62465.2024.10626577

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we present an end-to-end neural beamforming method for multi-channel speech enhancement in drone environments with severe noise levels. In a flying-drone environment, recordings are made with microphones attached to the drone, so the proximity and intensity of propeller and motor noise yield a low signal-to-noise ratio (SNR) for the target speech. EaBNet is a deep neural network (DNN) based beamformer that uses embedding and beamforming modules to address the lack of interpretability inherent in previous end-to-end beamforming models. EaBNet exploits spatial information for speech enhancement and seeks further improvement through a PostNet. Building upon EaBNet, we incorporate a module inspired by the structure of the Generalized Sidelobe Canceller (GSC) algorithm to estimate the spatial information of ego-noise through nullforming. Additionally, we propose estimating spectral features from the nullforming outputs as well as the beamforming output and using them as input to the PostNet to achieve higher performance. The results confirm that both methods improve performance.
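For readers unfamiliar with the GSC structure mentioned in the abstract, the following is a minimal sketch of a classical (non-neural) GSC for a single frequency bin. It is not the authors' EaBNet-based module: the uniform linear array, the known steering vector, the SVD-based blocking matrix, the closed-form least-squares canceller, and all function names and parameter values are assumptions chosen for illustration only. It shows the two branches the abstract alludes to: a beamformer steered at the target and a nullformer (blocking matrix) that removes the target so its outputs carry only noise.

```python
import numpy as np

def steering_vector(num_mics, spacing_m, angle_rad, freq_hz, c=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    delays = np.arange(num_mics) * spacing_m * np.sin(angle_rad) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def blocking_matrix(d):
    """Orthonormal basis of the subspace orthogonal to d, so that B^H d = 0."""
    # Full SVD of d as an (M, 1) column: the first left singular vector spans d,
    # the remaining M-1 columns span its orthogonal complement.
    u, _, _ = np.linalg.svd(d.reshape(-1, 1), full_matrices=True)
    return u[:, 1:]

def gsc(x, d, diag_load=1e-6):
    """Classical GSC for one frequency bin.

    x: (M, T) complex STFT frames, d: (M,) target steering vector.
    Returns the enhanced signal (T,) and the noise references (M-1, T).
    """
    num_frames = x.shape[1]
    w_q = d / np.vdot(d, d)          # fixed beamformer with w_q^H d = 1
    B = blocking_matrix(d)           # nullformer: places a null on the target
    y_fbf = w_q.conj() @ x           # beamforming branch output
    refs = B.conj().T @ x            # nullforming branch: target-free noise references
    # Least-squares canceller minimizing |y_fbf - w_a^H refs|^2 over all frames.
    R = refs @ refs.conj().T / num_frames + diag_load * np.eye(refs.shape[0])
    r = refs @ y_fbf.conj() / num_frames
    w_a = np.linalg.solve(R, r)
    return y_fbf - w_a.conj() @ refs, refs

# Toy usage: one target at broadside and one much louder directional interferer.
rng = np.random.default_rng(0)
d_target = steering_vector(4, 0.05, 0.0, 1000.0)
d_noise = steering_vector(4, 0.05, np.pi / 3, 1000.0)
speech = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
noise = 10.0 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
mixture = np.outer(d_target, speech) + np.outer(d_noise, noise)
enhanced, noise_refs = gsc(mixture, d_target)
```

In this toy setting the noise references contain no target component by construction, which is the property a nullforming branch is meant to provide; in a real drone recording the steering vector is not known exactly and the noise is not a single point source, so this sketch should be read only as an illustration of the structure.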


Pages: 755-759

Page count: 5

