DE-DPCTnet: Deep Encoder Dual-path Convolutional Transformer Network for Multi-channel Speech Separation

Cited by: 0
Authors
Wang, Zhenyu [1 ,2 ,4 ]
Zhou, Yi [1 ,2 ]
Gan, Lu [3 ,4 ]
Chen, Rilin
Tang, Xinyu [1 ,2 ]
Liu, Hongqing [1 ,2 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Chongqing 400065, Peoples R China
[2] Chongqing Key Lab Signal & Informat Proc, Chongqing 400065, Peoples R China
[3] Brunel Univ, Coll Engn Design & Phys Sci, London UB8 3PH, England
[4] Tencent AI Lab, Beijing, Peoples R China
Source
2022 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS) | 2022
Keywords
Speech separation; multi-channel; deep encoder; improved transformer; beamforming; TASNET;
DOI
10.1109/SIPS55645.2022.9919247
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Subject classification code
0812 ;
Abstract
In recent years, beamforming has been extensively investigated for the multi-channel speech separation task. In this paper, we propose a deep encoder dual-path convolutional transformer network (DE-DPCTnet), which directly estimates the beamforming filters for speech separation in the time domain. In order to learn the signal repetitions correctly, a nonlinear deep encoder module is proposed to replace the traditional linear one. An improved transformer is also developed, which uses convolutions to capture long-time speech sequences. The ablation studies demonstrate that both the deep encoder and the improved transformer benefit the separation performance. The comparisons show that DE-DPCTnet outperforms the state-of-the-art filter-and-sum network with transform-average-concatenate module (FaSNet-TAC), even with a lower computational complexity.
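The abstract refers to estimating beamforming filters directly in the time domain. As a rough illustration of the generic filter-and-sum step only (not the authors' implementation), the sketch below applies one FIR filter per microphone channel and sums the results. The function name `filter_and_sum`, the array shapes, and the use of full convolution are assumptions made for this example; the network that would estimate the filters is omitted entirely.

```python
import numpy as np


def filter_and_sum(frames, filters):
    """Apply one FIR filter per channel to a frame and sum over channels.

    frames:  array of shape (channels, frame_len), one frame per microphone.
    filters: array of shape (channels, filter_len); in a neural beamformer
             these would be estimated by the network (hypothetical input here).
    Returns an array of length frame_len + filter_len - 1 (full convolution).
    """
    frames = np.asarray(frames, dtype=float)
    filters = np.asarray(filters, dtype=float)
    if frames.shape[0] != filters.shape[0]:
        raise ValueError("frames and filters must have the same number of channels")
    # Convolve each channel with its own filter, then sum across channels.
    per_channel = [np.convolve(x, h, mode="full") for x, h in zip(frames, filters)]
    return np.sum(per_channel, axis=0)


if __name__ == "__main__":
    # Two channels, toy filters: pass channel 0 unchanged, delay channel 1 by one sample.
    frames = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    filters = [[1.0, 0.0], [0.0, 1.0]]
    print(filter_and_sum(frames, filters))
```

In practice a separate set of filters would be estimated per source and per frame, and the filtered frames would be recombined by overlap-add; those steps are left out of this sketch.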
Pages: 180-184
Page count: 5
Related papers
28 in total
  • [1] DCE-CDPPTnet: Dense Connected Encoder Cross Dual-path Parrel Transformer Network for Multi-channel Speech Separation
    Zhuang, Chenghao
    Zhou, Lin
    Cao, Yanxiang
    Wang, Qirui
    Cheng, Yunling
    2024 13TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, ICCCAS 2024, 2024, : 303 - 308
  • [2] Deep encoder/decoder dual-path neural network for speech separation in noisy reverberation environments
    Wang, Chunxi
    Jia, Maoshen
    Zhang, Xinfeng
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2023, 2023 (01)
  • [3] Deep encoder/decoder dual-path neural network for speech separation in noisy reverberation environments
    Chunxi Wang
    Maoshen Jia
    Xinfeng Zhang
    EURASIP Journal on Audio, Speech, and Music Processing, 2023
  • [4] Dual-Path Hybrid Attention Network for Monaural Speech Separation
    Qiu, Wenbo
    Hu, Ying
    IEEE ACCESS, 2022, 10 : 78754 - 78763
  • [5] DEEP COMPLEX CONVOLUTIONAL RECURRENT NETWORK FOR MULTI-CHANNEL SPEECH ENHANCEMENT AND DEREVERBERATION
    Gelderblom, Femke B.
    Myrvoll, Tor Andre
    2021 IEEE 31ST INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2021,
  • [6] SPLIT-ATTENTION MECHANISMS WITH GRAPH CONVOLUTIONAL NETWORK FOR MULTI-CHANNEL SPEECH SEPARATION
    Tan, YingWei
    Ding, XueFeng
    2024 18TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT, IWAENC 2024, 2024, : 140 - 144
  • [7] Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation
    Yang, Xue
    Bao, Changchun
    INTERSPEECH 2022, 2022, : 5338 - 5342
  • [8] Dual-Path Filter Network: Speaker-Aware Modeling for Speech Separation
    Wang, Fan-Lin
    Peng, Yu-Huai
    Lee, Hung-Shin
    Wang, Hsin-Min
    INTERSPEECH 2021, 2021, : 3061 - 3065
  • [9] Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation
    Chen, Jingjing
    Mao, Qirong
    Liu, Dong
    INTERSPEECH 2020, 2020, : 2642 - 2646
  • [10] Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss
    Shi, Ziqiang
    Liu, Rujie
    Han, Jiqing
    INTERSPEECH 2020, 2020, : 2682 - 2686