MULTI-CHANNEL SPEECH ENHANCEMENT USING BEAMFORMING AND NULLFORMING FOR SEVERELY ADVERSE DRONE ENVIRONMENT

Cited by: 0
Authors
Kim, Seokhyun [1]
Jeong, Won [2]
Park, Hyung-Min [1,2]
Affiliations
[1] Sogang Univ, Dept Elect Engn, Seoul, South Korea
[2] Sogang Univ, Dept Artificial Intelligence, Seoul, South Korea
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024 | 2024
Keywords
Multi-channel Speech Enhancement; Drone Audition Environments; Neural Beamforming;
DOI
10.1109/ICASSPW62465.2024.10626577
Chinese Library Classification number
O42 [Acoustics];
Subject classification number
070206 ; 082403 ;
Abstract
In this paper, we present an end-to-end neural beamforming method for multi-channel speech enhancement in drone environments with severe noise levels. When recording with microphones attached to a flying drone, the proximity and intensity of the propeller and motor noise result in a low signal-to-noise ratio (SNR) for the target speech. EaBNet is a Deep Neural Network (DNN) based beamformer that uses embedding and beamforming modules to address the lack of interpretability inherent in previous end-to-end beamforming models. EaBNet exploits spatial information for speech enhancement and seeks additional improvement through a PostNet. Building upon EaBNet, we incorporate a module inspired by the structure of the Generalized Sidelobe Canceller (GSC) algorithm to estimate the spatial information of ego-noise through nullforming. Additionally, we propose estimating spectral features from the nullforming outputs as well as the beamforming output and using them as input to the PostNet to achieve higher performance. The results confirmed that both methods improved performance.
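The GSC structure mentioned in the abstract can be illustrated with a minimal classical sketch for a single frequency bin. This is not the paper's neural model: it assumes a uniform linear array, a known far-field target direction, and random noise, and all function names and parameter values below are hypothetical. It shows only the two GSC branches, a fixed beamformer with unit gain toward the target and a blocking matrix that places a null on the target so that its outputs contain noise only.

```python
import numpy as np


def steering_vector(n_mics, spacing_m, angle_rad, freq_hz, c=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    delays = np.arange(n_mics) * spacing_m * np.cos(angle_rad) / c
    return np.exp(-2j * np.pi * freq_hz * delays)


def fixed_beamformer(d):
    """Delay-and-sum weights w with unit gain toward d, i.e. w^H d = 1."""
    return d / np.vdot(d, d).real


def blocking_matrix(d):
    """Matrix B whose columns are orthogonal to d, i.e. B^H d = 0 (nullforming)."""
    # In the full SVD of d (as a column), the first left singular vector is
    # parallel to d; the remaining ones span its orthogonal complement.
    u, _, _ = np.linalg.svd(d.reshape(-1, 1), full_matrices=True)
    return u[:, 1:]


# Hypothetical setup: 4 microphones, 5 cm spacing, target at 60 degrees, 1 kHz bin.
d = steering_vector(n_mics=4, spacing_m=0.05, angle_rad=np.pi / 3, freq_hz=1000.0)
w = fixed_beamformer(d)
B = blocking_matrix(d)

# One STFT bin: target component s * d plus strong noise n standing in for ego-noise.
rng = np.random.default_rng(0)
s = 0.1 + 0.2j
n = 3.0 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
x = s * d + n

beam_out = np.vdot(w, x)   # w^H x: target passes with unit gain, plus residual noise
null_out = B.conj().T @ x  # B^H x: target is cancelled, leaving noise references
```

In a classical GSC, the noise references in `null_out` drive an adaptive filter whose output is subtracted from `beam_out`. According to the abstract, the paper instead uses a learned module inspired by this structure.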
Pages: 755 - 759
Page count: 5
Related papers
32 in total
  • [21] CLOSING THE GAP BETWEEN TIME-DOMAIN MULTI-CHANNEL SPEECH ENHANCEMENT ON REAL AND SIMULATION CONDITIONS
    Zhang, Wangyou
    Shi, Jing
    Li, Chenda
    Watanabe, Shinji
    Qian, Yanmin
    2021 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2021, : 146 - 150
  • [22] ONE MODEL TO ENHANCE THEM ALL: ARRAY GEOMETRY AGNOSTIC MULTI-CHANNEL PERSONALIZED SPEECH ENHANCEMENT
    Taherian, Hassan
    Eskimez, Sefik Emre
    Yoshioka, Takuya
    Wang, Huaming
    Chen, Zhuo
    Huang, Xuedong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 271 - 275
  • [23] Signed Convex Combination of Fast Convergence Algorithm to Generalized Sidelobe Canceller Beamformer for Multi-Channel Speech Enhancement
    Priyanka, Siva S.
    Kumar, Kishore T.
    TRAITEMENT DU SIGNAL, 2021, 38 (03) : 785 - 795
  • [24] MIMO-SPEECH: END-TO-END MULTI-CHANNEL MULTI-SPEAKER SPEECH RECOGNITION
    Chang, Xuankai
    Zhang, Wangyou
    Qian, Yanmin
    Le Roux, Jonathan
    Watanabe, Shinji
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 237 - 244
  • [25] TaylorBeamformer: Learning All-Neural Beamformer for Multi-Channel Speech Enhancement from Taylor's Approximation Theory
    Li, Andong
    Yu, Guochen
    Zheng, Chengshi
    Li, Xiaodong
    INTERSPEECH 2022, 2022, : 5413 - 5417
  • [26] TaBE: Decoupling spatial and spectral processing with Taylor's unfolding method in the beamspace domain for multi-channel speech enhancement
    Li, Andong
    Yu, Guochen
    Xu, Zhongweiyang
    Fan, Cunhang
    Li, Xiaodong
    Zheng, Chengshi
    INFORMATION FUSION, 2024, 101
  • [27] TOWARDS LOW-DISTORTION MULTI-CHANNEL SPEECH ENHANCEMENT: THE ESPNET-SE SUBMISSION TO THE L3DAS22 CHALLENGE
    Lu, Yen-Ju
    Cornell, Samuele
    Chang, Xuankai
    Zhang, Wangyou
    Li, Chenda
    Ni, Zhaoheng
    Wang, Zhong-Qiu
    Watanabe, Shinji
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 9201 - 9205
  • [28] TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective
    Li, Andong
    Meng, Weixin
    Yu, Guochen
    Liu, Wenzhe
    Li, Xiaodong
    Zheng, Chengshi
    INTERSPEECH 2023, 2023, : 1055 - 1059
  • [29] Deep Neural Network-Based Generalized Sidelobe Canceller for Robust Multi-channel Speech Recognition
    Li, Guanjun
    Liang, Shan
    Nie, Shuai
    Liu, Wenju
    Yang, Zhanlei
    Xiao, Longshuai
    INTERSPEECH 2020, 2020, : 51 - 55
  • [30] Locate and Beamform: Two-dimensional Locating All-neural Beamformer for Multi-channel Speech Separation
    Fu, Yanjie
    Ge, Meng
    Wang, Honglong
    Li, Nan
    Yin, Haoran
    Wang, Longbiao
    Zhang, Gaoyan
    Dang, Jianwu
    Deng, Chengyun
    Wang, Fei
    INTERSPEECH 2023, 2023, : 3789 - 3793