DeepEar: Sound Localization With Binaural Microphones

Cited by: 7
Authors
Yang, Qiang [1]
Zheng, Yuanqing [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
Keywords
Binaural localization; multi-source localization; earable computing; NEURAL-NETWORKS; HEAD MOVEMENTS; NOISE; DIFFERENCE; FEATURES; SEARCH
DOI
10.1109/TMC.2022.3222821
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The binaural microphone, a pair of microphones mounted in artificial human-shaped ears, is widely used in hearing aids and spatial audio recording to improve sound quality. In many applications, such as binaural sound enhancement, it is crucial for these devices to find the direction of a voice. However, sound localization with two microphones remains challenging, especially in multi-source scenarios. Most previous work used microphone arrays to address the multi-source localization problem, but extra microphones are subject to space constraints in many deployment scenarios (e.g., hearing aids). Inspired by the fact that humans have evolved to locate multiple sound sources with only two ears, we propose DeepEar, a binaural microphone-based sound localization system. To this end, we design a multisector-based neural network that locates multiple sound sources simultaneously, where each sector is a discretized region of space covering a range of angles of arrival. DeepEar fuses explicit hand-crafted features with implicit latent sound representations to facilitate sound localization. More importantly, the trained DeepEar model can adapt to new environments with a minimal amount of extra training data. Experimental results show that DeepEar substantially outperforms the state-of-the-art binaural deep learning approach in terms of sound detection accuracy and azimuth estimation error.
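The sector idea in the abstract can be illustrated with a minimal sketch of how source azimuths might be turned into per-sector training targets and back. This is not the authors' implementation: the function names, the choice of 8 equal sectors over 360 degrees, and the (presence, offset) target layout are all assumptions made for illustration, and the sketch assumes at most one source per sector.

```python
def azimuths_to_sector_labels(azimuths_deg, num_sectors=8):
    """Encode source azimuths as per-sector targets (hypothetical layout).

    Returns two lists of length num_sectors:
      presence[i] is 1 if a source lies in sector i, else 0
      offsets[i]  is the source position within sector i, scaled to [0, 1]
    Assumes at most one source per sector; a later source overwrites an
    earlier one that falls in the same sector.
    """
    width = 360.0 / num_sectors
    presence = [0] * num_sectors
    offsets = [0.0] * num_sectors
    for az in azimuths_deg:
        az = az % 360.0  # wrap negative or >360 angles into [0, 360)
        # min() guards against a floating-point edge case at exactly 360.0
        idx = min(int(az // width), num_sectors - 1)
        presence[idx] = 1
        offsets[idx] = (az - idx * width) / width
    return presence, offsets


def sector_labels_to_azimuths(presence, offsets, threshold=0.5):
    """Decode per-sector targets (or network outputs) back to azimuths."""
    width = 360.0 / len(presence)
    return [
        (idx + offsets[idx]) * width
        for idx, p in enumerate(presence)
        if p >= threshold
    ]


if __name__ == "__main__":
    # Two simultaneous sources at 30 and 200 degrees
    presence, offsets = azimuths_to_sector_labels([30.0, 200.0])
    print(presence)
    print(sector_labels_to_azimuths(presence, offsets))
```

With this kind of encoding, detecting a source becomes a per-sector binary decision and azimuth estimation becomes a per-sector regression, which is one plausible way to read the "sound detection accuracy" and "azimuth estimation error" metrics mentioned in the abstract.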
Pages: 359-375
Page count: 17