The Minimum Overlap-Gap Algorithm for Speech Enhancement

Cited by: 3

Authors:

Hoang, Poul [1,2]

Tan, Zheng-Hua [1]

De Haan, Jan Mark [2]

Jensen, Jesper [1,2]

Affiliations:

[1] Aalborg Univ, Dept Elect Syst, DK-9000 Aalborg, Denmark

[2] Oticon AS, DK-2765 Smoerum, Denmark

Source:

IEEE Access | 2022 / Vol. 10

Keywords:

Speech enhancement; Noise measurement; Sensors; Microphones; Estimation; Direction-of-arrival estimation; Licenses; turn-taking; multichannel noise reduction; DOA estimation; multi-talker problem; estimation of the talker-of-interest; REVERBERATION; SEPARATION;

DOI:

10.1109/ACCESS.2022.3147514

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In this paper, we propose a novel speech enhancement paradigm for retrieving a desired speech signal in a multi-talker environment. The paradigm is a three-step procedure consisting of separation, ranking, and enhancement. First, a speech separation system (which could be a conventional spatial filter bank or a more advanced separation system) separates the mixtures of speech signals captured by the microphones into speech signals from candidate speakers. Next, novel ranking algorithms, proposed in this paper, are applied to determine the talker-of-interest amongst the separated speech signals. Finally, the speech signal of the talker-of-interest is estimated as a linear combination of the separated signals, whose weights are determined by the ranking algorithms. The ranking algorithms exploit turn-taking patterns between conversational partners to determine the talker-of-interest amongst competing speakers. Unlike some existing solutions, they do not require access to additional sensors, e.g., EEG electrodes or cameras, but rely only on microphone signals. Specifically, the proposed algorithms rank the separated speech signals based on the probability of speech overlaps and gaps with the user's own voice. The highest-ranked speech signal is that of the talker with the minimum probability of speech overlap and gap with the user's own voice. The proposed ranking algorithms are shown to be highly effective at determining the talker-of-interest, since conversational partners, i.e., the user and the talker-of-interest, behaviorally avoid speech overlaps and gaps. We evaluate the proposed speech enhancement paradigm in two practical hearing-aid-related applications, where the objective is to enhance the speech signal of a conversational partner in a multi-talker environment. The results of the evaluation demonstrate that the proposed speech enhancement systems significantly outperform conventional speech enhancement systems in both applications.
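
The ranking and weighting steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's estimator: the abstract does not specify how the overlap and gap probabilities are computed, so this sketch assumes that binary voice-activity sequences are already available for the user's own voice and for each separated candidate, estimates both probabilities as simple frame fractions, and turns the scores into combination weights with a softmax. The function names and the `sharpness` parameter are hypothetical.

```python
import numpy as np

def overlap_gap_score(own_vad, cand_vad):
    """Fraction of frames that are overlaps (both active) or gaps (both silent).

    Assumption: both inputs are equal-length binary voice-activity sequences.
    """
    own = np.asarray(own_vad, dtype=bool)
    cand = np.asarray(cand_vad, dtype=bool)
    overlap = np.mean(own & cand)    # both talk at the same time
    gap = np.mean(~own & ~cand)      # neither talks
    return overlap + gap

def rank_candidates(own_vad, cand_vads):
    """Return per-candidate scores and candidate indices sorted best-first."""
    scores = np.array([overlap_gap_score(own_vad, c) for c in cand_vads])
    order = np.argsort(scores, kind="stable")  # lowest overlap+gap first
    return scores, order

def combination_weights(scores, sharpness=10.0):
    """Hypothetical soft weighting: lower score -> larger weight, sums to 1."""
    logits = -sharpness * (scores - scores.min())
    w = np.exp(logits)
    return w / w.sum()

# Toy example: the partner alternates turns with the user,
# the competitor talks independently of the user.
own        = [1, 1, 0, 0, 1, 1, 0, 0]
partner    = [0, 0, 1, 1, 0, 0, 1, 1]
competitor = [1, 0, 1, 0, 1, 0, 1, 0]

scores, order = rank_candidates(own, [competitor, partner])
weights = combination_weights(scores)
print("scores:", scores)    # competitor first, partner second
print("best candidate index:", order[0])
print("weights:", weights)
```

In this toy case the partner never overlaps with the user and leaves no gaps, so it receives the lowest score and nearly all of the weight. With real signals the voice-activity decisions are noisy, so the scores would need to be estimated over longer windows.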


Pages: 14698-14716

Page count: 19

