A Low Computation Cost Model for Real-Time Speech Enhancement

Authors
Wang, Qirui [1]
Zhou, Lin [1]
Cao, Yanxiang [1]
Zhuang, Chenghao [1]
Cheng, Yunling [1]
Deng, Yuxi [1]
Affiliation
[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing, Peoples R China
Source
2024 13TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, ICCCAS 2024 | 2024
Keywords
conformer; low computation cost; real-time; speech enhancement
DOI
10.1109/ICCCAS62034.2024.10652686
Abstract
Developing speech enhancement systems for real-time scenarios is challenging because such systems require low computational complexity, parallel processing, and a causal structure. In this paper, we propose a speech enhancement model that operates in the time-frequency domain and uses only one-dimensional operations to reduce computation cost. Specifically, the proposed model follows a U-Net structure with several conformer blocks inserted. Our evaluation on the DNS Challenge and VoiceBank + DEMAND benchmarks shows that our model performs comparably to other state-of-the-art causal systems. Most importantly, the proposed model needs only 0.70G MACs to process a 16000-sample (1 second) speech signal and achieves a real-time factor (RTF) of 0.012, indicating a significant reduction in computational cost.
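To put the two efficiency figures from the abstract in concrete terms, here is a minimal Python sketch. It assumes the common definition of RTF as processing time divided by audio duration; the abstract does not state how the authors measured it. The helper names `real_time_factor` and `macs_per_sample` and the stand-in `process_fn` are hypothetical and are not taken from the paper.

```python
import time


def real_time_factor(process_fn, signal, sample_rate=16000, repeats=5):
    """Average wall-clock processing time divided by audio duration.

    process_fn is a hypothetical stand-in for any enhancement model;
    an RTF below 1 means the model runs faster than real time.
    """
    audio_seconds = len(signal) / sample_rate
    start = time.perf_counter()
    for _ in range(repeats):
        process_fn(signal)
    elapsed = (time.perf_counter() - start) / repeats
    return elapsed / audio_seconds


def macs_per_sample(total_macs, num_samples):
    """Average multiply-accumulate operations spent per input sample."""
    return total_macs / num_samples


# Figures quoted in the abstract: 0.70G MACs per 16000 samples, RTF 0.012.
per_sample = macs_per_sample(0.70e9, 16000)   # 43750 MACs per sample
seconds_per_audio_second = 0.012 * 1.0        # about 12 ms per second of audio

# Timing a trivial pass-through function, only to show the call shape.
dummy_signal = [0.0] * 16000
rtf = real_time_factor(lambda x: x, dummy_signal)
```

Under that definition, an RTF of 0.012 corresponds to roughly 12 ms of computation per second of audio on the hardware used for the measurement, which the abstract does not specify.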
Pages: 267-271
Page count: 5