REAL-TIME JOINT NOISE SUPPRESSION AND BANDWIDTH EXTENSION OF NOISY REVERBERANT WIDEBAND SPEECH

Authors
Gomez, Esteban [1 ,2 ]
Backstrom, Tom [1 ]
Affiliations
[1] Aalto Univ, Dept Informat & Commun Engn, Espoo, Finland
[2] Voicemod Inc, Valencia, Spain
Source
2024 18TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT, IWAENC 2024 | 2024
Keywords
Bandwidth extension; noise suppression; real-time; deep learning; multitasking; PERCEPTION
DOI
10.1109/IWAENC61483.2024.10694458
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Artificially extending the bandwidth of speech in real-time applications that are band-limited to 16 kHz (known as wide-band) or lower sample rates, such as VoIP or communication over Bluetooth, can significantly improve its perceptual quality. Typically, dry clean speech is assumed as input to estimate the missing spectral information. However, such an assumption falls short if the input speech is reverberant or has been contaminated by noise, resulting in audible artifacts. We propose a real-time low-complexity multitasking neural network capable of performing noise suppression and bandwidth extension from 16 kHz to 48 kHz (fullband) on a CPU, preventing such issues even if the noise cannot be completely removed from the input. Instead of employing a monolithic model, we adopt a modular approach and complexity reduction methods that result in a more compact model than the sum of its parts while improving its performance.
Pages: 6-10
Number of pages: 5