SPATIAL-TEMPORAL GRAPH CONVOLUTION NETWORK FOR MULTICHANNEL SPEECH ENHANCEMENT

Cited by: 4
Authors
Hao, Minghui [1]
Yu, Jingjing [1]
Zhang, Luyao [1]
Affiliation
[1] Beijing Jiaotong Univ, Elect & Informat Engn, Beijing, Peoples R China
Source
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022
Keywords
Graph convolution network; spatial dependency extraction; spatial-temporal convolution module; SII-weighted loss function; speech enhancement;
DOI
10.1109/ICASSP43922.2022.9746054
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Spatial dependency related to distributed microphone positions is essential for the multichannel speech enhancement task. Exploiting it remains challenging because accurate array positions are often unavailable and the spatial-temporal relations of multichannel noisy signals are complex. This paper proposes a spatial-temporal graph convolutional network composed of cascaded spatial-temporal (ST) modules with channel fusion. Without any prior information about the array or the acoustic scene, a graph convolution block is designed with a learnable adjacency matrix to capture the spatial dependency between pairs of channels. This block is then combined with a time-frequency convolution block to form the ST module, which fuses the multi-dimensional correlation features for target speech estimation. Furthermore, a novel weighted loss function based on the speech intelligibility index (SII) is proposed to direct more attention during network training to the frequency bands most important for human understanding. The framework is demonstrated to achieve over 11% performance improvement on PESQ and intelligibility against prior state-of-the-art approaches in multi-scene speech enhancement experiments.
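The two ideas named in the abstract, a graph convolution over microphone channels with a learnable adjacency matrix and a loss weighted by frequency-band importance, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, tensor shapes, softmax normalization of the adjacency matrix, ReLU activation, and the band weights are all assumptions chosen for illustration, and the weights shown are placeholders rather than actual SII band-importance values.

```python
import numpy as np

def softmax_rows(logits):
    """Normalize each row so it is non-negative and sums to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def channel_graph_conv(features, adj_logits, weight):
    """One graph convolution over microphone channels (hypothetical helper).

    features:   (channels, frames, bands) per-channel time-frequency features
    adj_logits: (channels, channels) free parameters; in training these would
                be updated by gradient descent, which is what makes the
                adjacency "learnable" with no array geometry given
    weight:     (bands, bands_out) feature transform shared by all channels
    """
    adjacency = softmax_rows(adj_logits)
    # Each output channel is a weighted mix of all input channels.
    mixed = np.einsum("ij,jtf->itf", adjacency, features)
    return np.maximum(mixed @ weight, 0.0)  # ReLU (assumed activation)

def band_weighted_mse(estimate, target, band_weights):
    """Squared error averaged over frames, with per-band importance weights."""
    squared_error = (estimate - target) ** 2          # (frames, bands)
    weighted = squared_error * band_weights           # broadcast over frames
    return float(weighted.sum() / (band_weights.sum() * estimate.shape[0]))

# Toy usage with random data; all sizes are arbitrary.
rng = np.random.default_rng(0)
num_channels, num_frames, num_bands = 4, 10, 8
features = rng.standard_normal((num_channels, num_frames, num_bands))
adj_logits = np.zeros((num_channels, num_channels))   # uniform mixing at init
weight = rng.standard_normal((num_bands, num_bands))
fused = channel_graph_conv(features, adj_logits, weight)

# Placeholder weights, NOT real SII band-importance values.
band_weights = np.linspace(1.0, 2.0, num_bands)
loss = band_weighted_mse(fused[0], np.zeros((num_frames, num_bands)), band_weights)
```

With all adjacency logits at zero, every channel receives the same average of all channels; training would move the logits away from this uniform starting point so that informative channel pairs receive larger weights.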
Pages: 6512-6516
Number of pages: 5