Sound source localization using deep learning models

Cited by: 86
Authors
Yalta N. [1]
Nakadai K. [2]
Ogata T. [1, 3]
Affiliations
[1] Intermedia Art and Science Department, Waseda University, 3-4-1 Ohkubo, Shinjuku, 169-8555, Tokyo
[2] Honda Research Institute Japan Co., Ltd, Tokyo Institute of Technology, 8-1 Honcho, Wako, 351-0188, Saitama
[3] Faculty of Science and Engineering, Waseda University, 3-4-1 Ohkubo, Shinjuku, 169-8555, Tokyo
Keywords
Deep learning; Deep residual networks; Sound source localization
DOI
10.20965/jrm.2017.p0037
Abstract
This study proposes the use of a deep neural network to localize a sound source using an array of microphones in a reverberant environment. During the last few years, applications based on deep neural networks have performed various tasks such as image classification or speech recognition to levels that exceed even human capabilities. In our study, we employ deep residual networks, which have recently shown remarkable performance in image classification tasks even when the training period is shorter than that of other models. Deep residual networks are used to process audio input similar to multiple signal classification (MUSIC) methods. We show that with end-to-end training and generic preprocessing, the performance of deep residual networks not only surpasses the block level accuracy of linear models on nearly clean environments but also shows robustness to challenging conditions by exploiting the time delay on power information. © 2017, Fuji Technology Press. All rights reserved.
Pages: 37-48
Page count: 11
Related papers
50 in total
  • [41] Sound source localization using the compensation method in robot platform
    Kwon, Byoungho
    Park, Youngjin
    Park, Youn-sik
    2007 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS, VOLS 1-6, 2007, : 753 - 756
  • [42] Sound Source Localization in Median Plane using Artificial Ear
    Lee, Sangmoon
    Hwang, Sungmok
    Park, Youngjin
    Park, Youn-sik
    2008 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS, VOLS 1-4, 2008, : 224 - 228
  • [43] Towards End-to-End Acoustic Localization Using Deep Learning: From Audio Signals to Source Position Coordinates
    Manuel Vera-Diaz, Juan
    Pizarro, Daniel
    Macias-Guarasa, Javier
    SENSORS, 2018, 18 (10)
  • [44] Deep Learning-Based Speech Specific Source Localization by Using Binaural and Monaural Microphone Arrays in Hearing Aids
    Goli, Peyman
    van de Par, Steven
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1652 - 1666
  • [45] Examining Recognition of Occupants' Cooking Activity Based on Sound Data Using Deep Learning Models
    Kim, Yuhwan
    Choi, Chang-Ho
    Park, Chang-Young
    Park, Seonghyun
    BUILDINGS, 2024, 14 (02)
  • [46] Enhanced SHL Recognition Using Machine Learning and Deep Learning Models with Multi-source Data
    Li, Mengyuan
    Zhu, Jun
    Zhang, Yuanyuan
    Lu, Xiaoling
    ADJUNCT PROCEEDINGS OF THE 2023 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING & THE 2023 ACM INTERNATIONAL SYMPOSIUM ON WEARABLE COMPUTING, UBICOMP/ISWC 2023 ADJUNCT, 2023, : 505 - 510
  • [47] Deep Learning in Indoor Localization Using WiFi
    Turgut, Zeynep
    Ustebay, Serpil
    Aydin, Gulsum Zeynep Gurkas
    Sertbas, Ahmet
    INTERNATIONAL TELECOMMUNICATIONS CONFERENCE, ITELCON 2017, 2019, 504 : 101 - 110
  • [48] Using Deep Learning for Sonar Targets Localization
    Bruel, Q.
    Heitzmann, F.
    Morche, D.
    Huillery, J.
    Blanco, E.
    Bako, L.
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON ADVANCES IN SIGNAL PROCESSING AND ARTIFICIAL INTELLIGENCE, ASPAI' 2020, 2020, : 77 - 81
  • [49] A double-step grid-free method for sound source identification using deep learning
    Feng, Luoyi
    Zan, Ming
    Huang, Linsen
    Xu, Zhongming
    APPLIED ACOUSTICS, 2022, 201
  • [50] A Time-domain Unsupervised Learning Based Sound Source Localization Method
    Huang, Yankun
    Wu, Xihong
    Qu, Tianshu
    2020 IEEE 3RD INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP 2020), 2020, : 26 - 32