The Minimum Overlap-Gap Algorithm for Speech Enhancement

Cited by: 3
Authors
Hoang, Poul [1, 2]
Tan, Zheng-Hua [1]
De Haan, Jan Mark [2]
Jensen, Jesper [1, 2]
Affiliations
[1] Aalborg Univ, Dept Elect Syst, DK-9000 Aalborg, Denmark
[2] Oticon AS, DK-2765 Smoerum, Denmark
Keywords
Speech enhancement; Noise measurement; Sensors; Microphones; Estimation; Direction-of-arrival estimation; Licenses; turn-taking; multichannel noise reduction; DOA estimation; multi-talker problem; estimation of the talker-of-interest; REVERBERATION; SEPARATION
DOI
10.1109/ACCESS.2022.3147514
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a novel speech enhancement paradigm that can effectively solve the problem of retrieving a desired speech signal in a multi-talker environment. The proposed paradigm is a three-step procedure consisting of separation, ranking, and enhancement. First, a speech separation system, which could be a conventional spatial filter bank or a more advanced separation system, separates the mixtures of speech signals captured by the microphones into speech signals from candidate speakers. Next, novel ranking algorithms, proposed in this paper, are applied to determine the talker-of-interest amongst the separated speech signals. Finally, the speech signal of the talker-of-interest is estimated as a linear combination of the separated signals, with weights determined by the ranking algorithms. The ranking algorithms exploit turn-taking patterns between conversational partners to determine the talker-of-interest amongst competing speakers. Unlike some existing solutions, they do not require access to additional sensors, e.g., EEG electrodes or cameras, and rely only on microphone signals. Specifically, the proposed algorithms rank the separated speech signals based on the probability of speech overlaps and gaps with the user's own voice. The highest-ranked speech signal is that of the talker with the minimum probability of speech overlap and gap with the user's own voice. The proposed ranking algorithms are shown to be highly effective at determining the talker-of-interest, since conversational partners, i.e., the user and the talker-of-interest, behaviorally avoid speech overlaps and gaps. We evaluate the proposed speech enhancement paradigm in two practical hearing-aid-related applications, where the objective is to enhance the speech signal of a conversational partner in a multi-talker environment. The results of the evaluation demonstrate that the proposed speech enhancement systems significantly outperform conventional speech enhancement systems in both applications.
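The ranking and enhancement steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that frame-wise voice activity flags (1 = speech, 0 = silence) are already available for the user's own voice and for each separated candidate, it approximates the probability of overlap and gap by the empirical fraction of frames, and it uses one-hot weights for the linear combination. The function names `overlap_gap_score`, `rank_candidates`, `one_hot_weights`, and `combine` are hypothetical.

```python
def overlap_gap_score(own_vad, cand_vad):
    """Fraction of frames that are an overlap (both talk) or a gap (both silent).

    Assumption: this empirical fraction stands in for the overlap/gap
    probability mentioned in the abstract.
    """
    if len(own_vad) != len(cand_vad) or len(own_vad) == 0:
        raise ValueError("VAD sequences must be non-empty and equally long")
    overlap = sum(1 for o, c in zip(own_vad, cand_vad) if o and c)
    gap = sum(1 for o, c in zip(own_vad, cand_vad) if not o and not c)
    return (overlap + gap) / len(own_vad)


def rank_candidates(own_vad, cand_vads):
    """Return (candidate indices sorted best-first, score per candidate)."""
    scores = [overlap_gap_score(own_vad, c) for c in cand_vads]
    # Lowest score first: minimum overlap-gap marks the likely conversation partner.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order, scores


def one_hot_weights(scores):
    """Assumed weighting: all weight on the top-ranked candidate."""
    best = min(range(len(scores)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(scores))]


def combine(separated, weights):
    """Linear combination of the separated signals, sample by sample."""
    n_samples = len(separated[0])
    return [
        sum(w * sig[n] for w, sig in zip(weights, separated))
        for n in range(n_samples)
    ]


if __name__ == "__main__":
    own = [1, 1, 0, 0, 1, 1, 0, 0]        # user's own voice activity
    competing = [1, 0, 1, 0, 1, 0, 1, 0]  # talks regardless of the user
    partner = [0, 0, 1, 1, 0, 0, 1, 1]    # takes turns with the user
    order, scores = rank_candidates(own, [competing, partner])
    print(order, scores)
    print(one_hot_weights(scores))
```

In this toy input the turn-taking partner never overlaps with the user and leaves no gaps, so it receives the lowest score and all of the weight. A practical system would estimate the flags from noisy signals and would likely use softer weights.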
Pages: 14698-14716
Number of pages: 19