Binaural Codebook-Based Speech Enhancement With Atomic Speech Presence Probability

Cited by: 10

Authors:

Wood, Sean U. N. [1]

Stahl, Johannes K. W. [1]

Mowlaee, Pejman [1, 2]

Affiliations:

[1] Graz Univ Technol, Signal Proc & Speech Commun Lab, A-8010 Graz, Austria

[2] Widex AS, DK-3540 Lynge, Denmark

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2019 / Vol. 27 / No. 12

Funding:

Austrian Science Fund;

Keywords:

Speech enhancement; Speech coding; Noise reduction; Noise measurement; Estimation; Indexes; Binaural speech enhancement; atomic speech presence probability; nonnegative matrix factorization; interaural transfer function; QUALITY ASSESSMENT; SOURCE SEPARATION; NOISE-REDUCTION; LOCALIZATION; HEARING; PRESERVATION; ENVIRONMENT; MODEL;

DOI:

10.1109/TASLP.2019.2937174

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this work, we present a universal codebook-based speech enhancement framework that relies on a single codebook to encode both speech and noise components. The atomic speech presence probability (ASPP) is defined as the probability that a given codebook atom encodes speech at a given point in time. We develop ASPP estimators based on binaural cues including the interaural phase and level difference (IPD and ILD), the interaural coherence magnitude (ICM), as well as a combined version leveraging the full interaural transfer function (ITF). We evaluate the performance of the resulting ASPP-based speech enhancement algorithms on binaural mixtures of reverberant speech and real-world noise. The proposed approach improves both objective speech quality and intelligibility over a wide range of input SNR, as measured with PESQ and binaural STOI metrics, outperforming two binaural speech enhancement benchmark methods. We show that the proposed ITF-based ASPP approach achieves a good balance of the trade-off between binaural noise reduction and binaural cue preservation.


Pages: 2150-2161

Number of pages: 12
