Sound source localization using deep learning models

Cited by: 86

Authors:

Yalta N. [1]

Nakadai K. [2]

Ogata T. [1,3]

Affiliations:

[1] Intermedia Art and Science Department, Waseda University, 3-4-1 Ohkubo, Shinjuku, 169-8555, Tokyo

[2] Honda Research Institute Japan Co., Ltd, Tokyo Institute of Technology, 8-1 Honcho, Wako, 351-0188, Saitama

[3] Faculty of Science and Engineering, Waseda University, 3-4-1 Ohkubo, Shinjuku, 169-8555, Tokyo

Source:

2017 / Fuji Technology Press / 29

Keywords:

Deep learning; Deep residual networks; Sound source localization

DOI:

10.20965/jrm.2017.p0037

Abstract:

This study proposes the use of a deep neural network to localize a sound source using an array of microphones in a reverberant environment. During the last few years, applications based on deep neural networks have performed various tasks such as image classification or speech recognition to levels that exceed even human capabilities. In our study, we employ deep residual networks, which have recently shown remarkable performance in image classification tasks even when the training period is shorter than that of other models. Deep residual networks are used to process audio input similar to multiple signal classification (MUSIC) methods. We show that with end-to-end training and generic preprocessing, the performance of deep residual networks not only surpasses the block-level accuracy of linear models in nearly clean environments but also shows robustness to challenging conditions by exploiting the time delay on power information. © 2017, Fuji Technology Press. All rights reserved.

Pages: 37-48

Page count: 11

50 entries in total

[31] Sound Source Localization Using Piezoelectric Acoustic Metasurfaces
Gu, Jin-Cheng
Lin, Wei
Kan, Cai-Xia
ACOUSTICS AUSTRALIA, 2020, 48 (03) : 455 - 461
[32] DISCRIMINATIVE MULTIPLE SOUND SOURCE LOCALIZATION BASED ON DEEP NEURAL NETWORKS USING INDEPENDENT LOCATION MODEL
Takeda, Ryu
Komatani, Kazunori
2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 603 - 609
[33] Heart Sound Classification Using Wavelet Analysis Approaches and Ensemble of Deep Learning Models
Lee, Jin-A
Kwak, Keun-Chang
APPLIED SCIENCES-BASEL, 2023, 13 (21):
[34] Underwater Sound Source Range Estimation Based on Deep Learning
Qu, Yuchen
Huang, Yiqian
Ren, Xinmin
Chen, Yang
Han, Tianshun
OCEANS 2024 - SINGAPORE, 2024,
[35] Performing a Research Study Using Open-Source Deep Learning Models
Kim, Hyungjin
KOREAN JOURNAL OF RADIOLOGY, 2024, 25 (03) : 217 - 219
[36] Multitask Learning of Time-Frequency CNN for Sound Source Localization
Pang, Cheng
Liu, Hong
Li, Xiaofei
IEEE ACCESS, 2019, 7 : 40725 - 40737
[37] Toward learning robust contrastive embeddings for binaural sound source localization
Tang, Duowei
Taseska, Maja
van Waterschoot, Toon
FRONTIERS IN NEUROINFORMATICS, 2022, 16
[38] A weighted MVDR beamformer based on SVM learning for sound source localization
Salvati, Daniele
Drioli, Carlo
Foresti, Gian Luca
PATTERN RECOGNITION LETTERS, 2016, 84 : 15 - 21
[39] Sound Spectrum Detection using Deep Learning
Ozdes, Merve
Severoglu, Batuhan Mert
2019 SCIENTIFIC MEETING ON ELECTRICAL-ELECTRONICS & BIOMEDICAL ENGINEERING AND COMPUTER SCIENCE (EBBT), 2019,
[40] Sound source localization for source inside a structure using Ac-CycleGAN
Kita, Shunsuke
Park, Choong Sik
Kajikawa, Yoshinobu
JOURNAL OF SOUND AND VIBRATION, 2024, 591