Binaural Codebook-Based Speech Enhancement With Atomic Speech Presence Probability

Cited by: 10

Authors:

Wood, Sean U. N. [1]

Stahl, Johannes K. W. [1]

Mowlaee, Pejman [1,2]

Affiliations:

[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, A-8010 Graz, Austria

[2] Widex AS, DK-3540 Lynge, Denmark

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2019, Vol. 27, No. 12

Funding:

Austrian Science Fund;

Keywords:

Speech enhancement; Speech coding; Noise reduction; Noise measurement; Estimation; Indexes; Binaural speech enhancement; atomic speech presence probability; nonnegative matrix factorization; interaural transfer function; QUALITY ASSESSMENT; SOURCE SEPARATION; NOISE-REDUCTION; LOCALIZATION; HEARING; PRESERVATION; ENVIRONMENT; MODEL;

DOI:

10.1109/TASLP.2019.2937174

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this work, we present a universal codebook-based speech enhancement framework that relies on a single codebook to encode both speech and noise components. The atomic speech presence probability (ASPP) is defined as the probability that a given codebook atom encodes speech at a given point in time. We develop ASPP estimators based on binaural cues including the interaural phase and level difference (IPD and ILD), the interaural coherence magnitude (ICM), as well as a combined version leveraging the full interaural transfer function (ITF). We evaluate the performance of the resulting ASPP-based speech enhancement algorithms on binaural mixtures of reverberant speech and real-world noise. The proposed approach improves both objective speech quality and intelligibility over a wide range of input SNR, as measured with PESQ and binaural STOI metrics, outperforming two binaural speech enhancement benchmark methods. We show that the proposed ITF-based ASPP approach achieves a good balance of the trade-off between binaural noise reduction and binaural cue preservation.

Pages: 2150-2161

Page count: 12

50 items in total

[21] CUE-PRESERVING MMSE FILTER WITH BAYESIAN SNR MARGINALIZATION FOR BINAURAL SPEECH ENHANCEMENT
Thaleiser, Stefan
Enzner, Gerald
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6124 - 6128
[22] Speech Enhancement Based on Codebook Constrained Nonnegative Matrix Factorization
Bai, Zhigang
Bao, Changchun
Yan, Bofang
2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018, : 361 - 365
[23] Unsupervised Speech Enhancement Using Optimal Transport and Speech Presence Probability
Jiang, Wenbin
Yu, Kai
Wen, Fei
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 4445 - 4455
[24] Deep Latent Fusion Layers for Binaural Speech Enhancement
Gajecki, Tom
Nogueira, Waldo
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 3127 - 3138
[25] BINAURAL SPEECH ENHANCEMENT USING DEEP COMPLEX CONVOLUTIONAL TRANSFORMER NETWORKS
Tokala, Vikas
Grinstein, Eric
Brookes, Mike
Doclo, Simon
Jensen, Jesper
Naylor, Patrick A.
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 681 - 685
[26] Speech Enhancement Based on Teacher-Student Deep Learning Using Improved Speech Presence Probability for Noise-Robust Speech Recognition
Tu, Yan-Hui
Du, Jun
Lee, Chin-Hui
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (12) : 2080 - 2091
[27] DNN-BASED SPEECH PRESENCE PROBABILITY ESTIMATION FOR MULTI-FRAME SINGLE-MICROPHONE SPEECH ENHANCEMENT
Tammen, Marvin
Fischer, Doerte
Meyer, Bernd T.
Doclo, Simon
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 191 - 195
[28] Speech Enhancement Combining NMF Weighted by Speech Presence Probability and Statistical Model
Hu, Yonggang
Zhang, Xiongwei
Zou, Xia
Min, Gang
Sun, Meng
Zheng, Yunfei
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2015, E98A (12) : 2701 - 2704
[29] MODEL BASED BINAURAL ENHANCEMENT OF VOICED AND UNVOICED SPEECH
Kavalekalam, Mathew Shaji
Christensen, Mads Graesboll
Boldt, Jesper B.
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 666 - 670
[30] Speech enhancement methods based on binaural cue coding
Wang, Xianyun
Bao, Changchun
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2019, 2019 (01)
