PFRNet: Dual-Branch Progressive Fusion Rectification Network for Monaural Speech Enhancement

Cited by: 12
Authors
Yu, Runxiang [1, 2]
Zhao, Ziwei [1, 2]
Ye, Zhongfu [1, 2]
Affiliations
[1] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230027, Anhui, Peoples R China
[2] Natl Engn Res Ctr Speech & Language Informat Proc, Hefei 230027, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Transformers; Speech enhancement; Tensors; Convolution; Decoding; Time-frequency analysis; Fusion rectification block; interactive time-frequency improved transformer; monaural speech enhancement;
DOI
10.1109/LSP.2022.3222045
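Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract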
In recent years, the transformer-based dual-branch magnitude and complex spectrum estimation framework has achieved state-of-the-art performance for monaural speech enhancement. However, insufficient use of the interactive information in the middle layers leaves each branch without the ability to compensate and rectify. To address this problem, this letter proposes a novel dual-branch progressive fusion rectification network (PFRNet) for monaural speech enhancement. PFRNet is an encoder-decoder-based dual-branch structure with interactive improved real & complex transformers. In PFRNet, a fusion rectification block is proposed to convert the implicit relationship between the two branches into a fusion feature through a frequency-domain mutual attention mechanism. The fusion feature provides a platform for interaction in the middle layers. The interactive time-frequency improved real & complex transformer makes better use of long-term dependencies in the time-frequency domain. Experimental results show that the proposed PFRNet outperforms most advanced dual-branch speech enhancement approaches and previous advanced systems in terms of speech quality and intelligibility.
Pages: 2358-2362
Number of pages: 5
Related papers
50 in total
  • [31] Salient Object Detection With Dual-Branch Stepwise Feature Fusion and Edge Refinement
    Song, Xiaogang
    Guo, Fuqiang
    Zhang, Lei
    Lu, Xiaofeng
    Hei, Xinhong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (04) : 2832 - 2844
  • [32] Dual-Path Hybrid Attention Network for Monaural Speech Separation
    Qiu, Wenbo
    Hu, Ying
    IEEE ACCESS, 2022, 10 : 78754 - 78763
  • [33] DEFNet: Dual-Branch Enhanced Feature Fusion Network for RGB-T Crowd Counting
    Zhou, Wujie
    Pan, Yi
    Lei, Jingsheng
    Ye, Lv
    Yu, Lu
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (12) : 24540 - 24549
  • [34] Dual-Branch Spectral–Spatial Attention Network for Hyperspectral Image Classification
    Zhao, Jinling
    Wang, Jiajie
    Ruan, Chao
    Dong, Yingying
    Huang, Linsheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 18
  • [35] A Dual-Branch Spatial-Temporal Learning Network for Video Prediction
    Huang, Huilin
    Guan, Yepeng
    IEEE ACCESS, 2024, 12 : 73258 - 73267
  • [36] A Dual-Branch Detail Extraction Network for Hyperspectral Pansharpening
    Qu, Jiahui
    Hou, Shaoxiong
    Dong, Wenqian
    Xiao, Song
    Du, Qian
    Li, Yunsong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [37] Hyperspectral Image Classification Based on Dual-Branch Spectral Multiscale Attention Network
    Shi, Cuiping
    Liao, Diling
    Xiong, Yi
    Zhang, Tianyu
    Wang, Liguo
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2021, 14 : 10450 - 10467
  • [38] A novel target decoupling framework based on waveform-spectrum fusion network for monaural speech enhancement
    Yu, Runxiang
    Chen, Wenzhuo
    Ye, Zhongfu
    DIGITAL SIGNAL PROCESSING, 2023, 141
  • [39] Double Adversarial Network based Monaural Speech Enhancement for Robust Speech Recognition
    Du, Zhihao
    Han, Jiqing
    Zhang, Xueliang
    INTERSPEECH 2020, 2020, : 309 - 313
  • [40] MRGAN: LightWeight Monaural Speech Enhancement Using GAN Network
    Meng, Chunyu
    Wei, Guangcun
    Long, Yanhong
    Kong, Chuike
    Ma, Penghao
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT IV, 2025, 15034 : 369 - 377