Exploiting Deep Neural Networks and Head Movements for Robust Binaural Localization of Multiple Sources in Reverberant Environments

Cited by: 94
Authors
Ma, Ning [1]
May, Tobias [2]
Brown, Guy J. [1]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[2] Tech Univ Denmark, Hearing Syst Grp, DK-2800 Lyngby, Denmark
Keywords
Binaural sound source localisation; deep neural networks; head movements; machine hearing; multi-conditional training; reverberation; PROBABILISTIC MODEL; CUES
DOI
10.1109/TASLP.2017.2750760
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for robust binaural localization of multiple sources in reverberant environments. DNNs are used to learn the relationship between the source azimuth and binaural cues, consisting of the complete cross-correlation function (CCF) and interaural level differences (ILDs). In contrast to many previous binaural hearing systems, the proposed approach is not restricted to localization of sound sources in the frontal hemifield. Due to the similarity of binaural cues in the frontal and rear hemifields, front-back confusions often occur. To address this, a head movement strategy is incorporated in the localization model to help reduce the front-back errors. The proposed DNN system is compared to a Gaussian-mixture-model-based system that employs interaural time differences (ITDs) and ILDs as localization features. Our experiments show that the DNN is able to exploit information in the CCF that is not available in the ITD cue, which together with head movements substantially improves localization accuracies under challenging acoustic scenarios, in which multiple talkers and room reverberation are present.
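To make the two binaural cues named in the abstract concrete, the following is a minimal sketch of how a cross-correlation function (CCF) and an interaural level difference (ILD) might be computed for a single frame of left-ear and right-ear signals. It is not the authors' implementation: the function name `binaural_cues`, the 1 ms lag range, the energy normalization, and the assumption that the inputs are already band-filtered frames of equal length are all illustrative choices that the abstract does not specify.

```python
import numpy as np

def binaural_cues(left, right, fs, max_lag_ms=1.0):
    """Return (ccf, ild_db) for one frame of left/right ear signals.

    Hypothetical helper: the lag range and normalization are assumptions,
    not values taken from the paper.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if left.shape != right.shape or left.ndim != 1:
        raise ValueError("left and right must be 1-D arrays of equal length")

    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    if max_lag >= len(left):
        raise ValueError("frame is too short for the requested lag range")

    # Remove the mean so the correlation reflects signal structure only.
    l = left - left.mean()
    r = right - right.mean()

    # Full linear cross-correlation; index len(r) - 1 is zero lag.
    full = np.correlate(l, r, mode="full")
    centre = len(r) - 1
    ccf = full[centre - max_lag: centre + max_lag + 1]

    # Normalize by the geometric mean of the two frame energies.
    norm = np.sqrt(np.sum(l ** 2) * np.sum(r ** 2))
    ccf = ccf / norm if norm > 0 else np.zeros_like(ccf)

    # ILD in dB: positive when the left ear carries more energy.
    eps = 1e-12
    ild_db = 10.0 * np.log10((np.sum(left ** 2) + eps) / (np.sum(right ** 2) + eps))
    return ccf, ild_db

# Usage: the right ear receives a delayed, attenuated copy of the left ear.
rng = np.random.default_rng(0)
fs = 16000
frame_left = rng.standard_normal(320)          # one 20 ms frame
frame_right = 0.5 * np.roll(frame_left, 4)     # 4-sample delay, half amplitude
ccf, ild_db = binaural_cues(frame_left, frame_right, fs)
peak_lag = int(np.argmax(ccf)) - (len(ccf) - 1) // 2
```

An ITD feature would keep only the lag of the CCF peak (`peak_lag` here), whereas the approach described in the abstract feeds the complete CCF to the DNN, so the shape of the function around and away from the peak remains available as evidence.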
Pages: 2444-2453
Number of pages: 10
Related papers
39 in total
  • [1] Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions
    Ma, Ning
    Brown, Guy J.
    May, Tobias
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3302 - 3306
  • [2] Binaural Localization of Multiple Sources in Reverberant and Noisy Environments
    Woodruff, John
    Wang, DeLiang
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (05): : 1503 - 1512
  • [3] Exploiting top-down source models to improve binaural localisation of multiple sources in reverberant environments
    Ma, Ning
    Brown, Guy J.
    Gonzalez, Jose A.
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 160 - 164
  • [4] Binaural reverberant Speech separation based on deep neural networks
    Zhang, Xueliang
    Wang, DeLiang
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2018 - 2022
  • [5] Exploiting Structures of Temporal Causality for Robust Speaker Localization in Reverberant Environments
    Schymura, Christopher
    Guo, Peng
    Maymon, Yanir
    Rafaely, Boaz
    Kolossa, Dorothea
    LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION (LVA/ICA 2018), 2018, 10891 : 228 - 237
  • [6] A MACHINE-HEARING SYSTEM EXPLOITING HEAD MOVEMENTS FOR BINAURAL SOUND LOCALISATION IN REVERBERANT CONDITIONS
    Ma, Ning
    May, Tobias
    Wierstorf, Hagen
    Brown, Guy J.
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 2699 - 2703
  • [7] ROBUST LOCALISATION OF MULTIPLE SPEAKERS EXPLOITING HEAD MOVEMENTS AND MULTI-CONDITIONAL TRAINING OF BINAURAL CUES
    May, Tobias
    Ma, Ning
    Brown, Guy J.
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 2679 - 2683
  • [8] Binaural Classification for Reverberant Speech Segregation Using Deep Neural Networks
    Jiang, Yi
    Wang, DeLiang
    Liu, RunSheng
    Feng, ZhenMing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (12) : 2112 - 2121
  • [9] Low latency localization of multiple sound sources in reverberant environments
    Durkovic, Marko
    Habigt, Tim
    Rothbucher, Martin
    Diepold, Klaus
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2011, 130 (06): : EL392 - EL398
  • [10] ROBUST LOCALIZATION OF MULTIPLE SOURCES IN REVERBERANT ENVIRONMENTS USING EB-ESPRIT WITH SPHERICAL MICROPHONE ARRAYS
    Sun, Haohai
    Teutsch, Heinz
    Mabande, Edwin
    Kellermann, Walter
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 117 - 120