A Low Computation Cost Model for Real-Time Speech Enhancement

Cited by: 0

Authors:

Wang, Qirui [1]
Zhou, Lin [1]
Cao, Yanxiang [1]
Zhuang, Chenghao [1]
Cheng, Yunling [1]
Deng, Yuxi [1]

Affiliations:

[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing, Peoples R China

Source:

2024 13TH INTERNATIONAL CONFERENCE ON COMMUNICATIONS, CIRCUITS AND SYSTEMS, ICCCAS 2024 | 2024

Keywords:

conformer; low computation cost; real-time; speech enhancement

DOI:

10.1109/ICCCAS62034.2024.10652686

Abstract:

Developing speech enhancement systems for real-time scenarios is challenging because it requires low computational complexity, parallel processing, and a causal structure. In this paper, we propose a speech enhancement model that operates in the time-frequency domain and uses only one-dimensional operations to reduce computation cost. Specifically, the proposed model follows a U-Net structure with several conformer blocks inserted. Our evaluation on the DNS Challenge and VoiceBank + DEMAND benchmarks shows that our model performs comparably to other state-of-the-art causal systems. Most importantly, the proposed model needs only 0.70G MACs to process a 1-second speech signal (16,000 samples) and achieves a real-time factor (RTF) of 0.012, indicating a significantly reduced computational cost.
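
As a rough illustration of the two cost figures quoted in the abstract, the sketch below assumes the common definition of RTF as processing time divided by the duration of the audio processed; the record itself does not spell out the definition or the hardware used. The function and constant names are hypothetical and chosen only for this example.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF under the assumed definition: processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Figures taken from the abstract.
SAMPLE_RATE_HZ = 16_000            # 16000 samples correspond to 1 second
MACS_PER_AUDIO_SECOND = 0.70e9     # 0.70G MACs per 1 second of speech
REPORTED_RTF = 0.012

# Derived quantities (simple arithmetic, not reported in the record).
macs_per_sample = MACS_PER_AUDIO_SECOND / SAMPLE_RATE_HZ
processing_ms_per_audio_second = REPORTED_RTF * 1.0 * 1000.0

print(f"MACs per input sample: {macs_per_sample:.0f}")
print(f"Processing time per second of audio: {processing_ms_per_audio_second:.1f} ms")
```

Under this assumed definition, an RTF below 1 means the model runs faster than real time; an RTF of 0.012 corresponds to roughly 12 ms of processing per second of audio on whatever hardware the authors measured.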

Pages: 267-271

Page count: 5
