Joint Time-Frequency and Time Domain Learning for Speech Enhancement

Authors
Tang, Chuanxin [1]
Luo, Chong [1]
Zhao, Zhiyuan [1]
Xie, Wenxuan [1]
Zeng, Wenjun [1]
Affiliations
[1] Microsoft Research Asia, Beijing, People's Republic of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For single-channel speech enhancement, time-domain and time-frequency-domain (T-F-domain) methods each have their own pros and cons. In this paper, we present a cross-domain framework named TFT-Net, which takes a time-frequency spectrogram as input and produces a time-domain waveform as output. Such a framework takes advantage of what is known about spectrograms while avoiding some of the drawbacks that T-F-domain methods have suffered from. In TFT-Net, we design a dual-path attention block (DAB) to fully exploit correlations along the time and frequency axes. We further find that a sample-independent DAB (SDAB) achieves a good trade-off between enhanced speech quality and complexity. Ablation studies show that both the cross-domain design and the SDAB block bring large performance gains. When logarithmic MSE is used as the training criterion, TFT-Net achieves the highest SDR and SSNR among state-of-the-art methods on two major speech enhancement benchmarks.
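The abstract names a logarithmic MSE training criterion and the SDR metric without defining either. As a rough illustration only, the sketch below assumes one common form of each: the log-MSE is taken as 10·log10 of the mean squared waveform error, and SDR as the ratio of clean-signal energy to error energy in dB. The function names `log_mse_loss` and `sdr_db`, the `eps` parameter, and these exact formulas are assumptions for illustration, not definitions taken from the paper.

```python
import math


def log_mse_loss(clean, enhanced, eps=1e-8):
    """Assumed log-MSE: 10 * log10 of the mean squared waveform error.

    `eps` (hypothetical) keeps the logarithm finite for a perfect estimate.
    """
    if len(clean) != len(enhanced):
        raise ValueError("waveforms must have the same length")
    mse = sum((c - e) ** 2 for c, e in zip(clean, enhanced)) / len(clean)
    return 10.0 * math.log10(mse + eps)


def sdr_db(clean, enhanced, eps=1e-8):
    """Assumed plain SDR in dB: clean-signal energy over error energy."""
    if len(clean) != len(enhanced):
        raise ValueError("waveforms must have the same length")
    signal_energy = sum(c * c for c in clean)
    error_energy = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


if __name__ == "__main__":
    # One second of a 440 Hz tone at 16 kHz as a stand-in for clean speech.
    clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
    # A toy "enhanced" signal that is the clean tone scaled by 0.9.
    enhanced = [0.9 * c for c in clean]
    print("log-MSE (dB):", log_mse_loss(clean, enhanced))
    print("SDR (dB):", sdr_db(clean, enhanced))
```

Under these assumed definitions, lowering the log-MSE of a waveform estimate raises its SDR by the same number of dB for a fixed clean signal, which is one plausible reason such a loss pairs well with SDR-style metrics.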
Pages: 3816-3822
Number of pages: 7
Related papers
50 in total
  • [41] Joint time-frequency domain identification of nonlinearly controlled structures
    Jin, Gang
    Sain, Michael K.
    Spencer, Billie F., Jr.
    Pham, Khanh D.
    MODELING, SIMULATION, AND VERIFICATION OF SPACE-BASED SYSTEMS III, 2006, 6221
  • [42] Joint time-frequency analysis
    Qian, Shie
    Chen, Dapang
    IEEE Signal Processing Magazine, 1999, 16 (02): : 52 - 67
  • [43] A Time-Frequency Domain Formant Frequency Estimation Scheme for Noisy Speech Signals
    Fattah, S. A.
    Zhu, W-P.
    Ahmad, M. O.
    ISCAS: 2009 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-5, 2009, : 1201 - 1204
  • [44] A Comparative Study of Time and Frequency Domain Approaches to Deep Learning based Speech Enhancement
    Nossier, Soha A.
    Wall, Julie
    Moniri, Mansour
    Glackin, Cornelius
    Cannings, Nigel
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [45] Joint Time-Frequency Scattering
    Anden, Joakim
    Lostanlen, Vincent
    Mallat, Stephane
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2019, 67 (14) : 3704 - 3718
  • [46] FILTERING IN THE TIME-FREQUENCY DOMAIN
    SALEH, BEA
    ASI, MK
    ADVANCED ALGORITHMS AND ARCHITECTURES FOR SIGNAL PROCESSING IV, 1989, 1152 : 426 - 436
  • [47] A time-frequency fusion model for multi-channel speech enhancement
    Zeng, Xiao
    Xu, Shiyun
    Wang, Mingjiang
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01):
  • [48] Wavelet-Based Speech Enhancement Using Time-Frequency Adaptation
    Wang, Kun-Ching
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2009,
  • [49] Variance based time-frequency mask estimation for unsupervised speech enhancement
    Nasir Saleem
    Muhammad Irfan Khattak
    Gunawan Witjaksono
    Gulzar Ahmad
    Multimedia Tools and Applications, 2019, 78 : 31867 - 31891
  • [50] Wavelet-Based Speech Enhancement Using Time-Frequency Adaptation
    Kun-Ching Wang
    EURASIP Journal on Advances in Signal Processing, 2009