Configuration-Invariant Sound Localization Technique Using Azimuth-Frequency Representation and Convolutional Neural Networks

Cited by: 2
Authors
Chun, Chanjun [1]
Jeon, Kwang Myung [2]
Choi, Wooyeol [3]
Affiliations
[1] Korea Inst Civil Engn & Bldg Technol KICT, Future Infrastruct Res Ctr, Goyang 10223, South Korea
[2] IntFlow Co Ltd, Gwangju 61080, South Korea
[3] Chosun Univ, Dept Comp Engn, Gwangju 61452, South Korea
Funding
National Research Foundation of Singapore;
Keywords
azimuth-frequency representation; configuration-invariant; convolutional neural network (CNN); sound localization;
DOI
10.3390/s20133768
Chinese Library Classification number
O65 [Analytical Chemistry];
Subject classification codes
070302; 081704;
Abstract
Deep neural networks (DNNs) have achieved significant advancements in speech processing, and numerous DNN architectures have been proposed in the field of sound localization. When a DNN model is deployed for sound localization, a fixed input size is required. This size is generally determined by the number of microphones, the fast Fourier transform size, and the frame size. If the number or configuration of the microphones changes, the DNN model must be retrained because the size of the input features changes. In this paper, we propose a configuration-invariant sound localization technique using the azimuth-frequency representation and convolutional neural networks (CNNs). The proposed CNN model receives the azimuth-frequency representation, instead of time-frequency features, as its input. The proposed model was evaluated in environments whose microphone configurations differed from the one on which it was originally trained. For evaluation, a single sound source was simulated using the image method. The evaluations confirmed that the localization performance of the proposed model was superior to that of the conventional steered response power phase transform (SRP-PHAT) and multiple signal classification (MUSIC) methods.
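The abstract does not say how the azimuth-frequency representation is computed, so the following is only an illustrative sketch and not the authors' formulation. It assumes a far-field source, a planar array, and a PHAT-weighted steered power per (azimuth, frequency) bin; the names `azimuth_frequency_map` and `ring` and all parameter values are hypothetical. What it is meant to show is the property the abstract relies on: the output shape depends only on the azimuth grid and the FFT size, not on the number of microphones.

```python
import numpy as np

def azimuth_frequency_map(frame, mic_xy, fs, n_fft=512, n_az=72, c=343.0):
    """PHAT-weighted steered power per (azimuth, frequency) bin (far-field assumption).

    frame:  (n_mics, n_samples) array, one time frame per microphone
    mic_xy: (n_mics, 2) array of microphone positions in metres
    Returns an array of shape (n_az, n_fft // 2 + 1) for any n_mics.
    """
    spec = np.fft.rfft(frame, n=n_fft, axis=1)               # (M, F)
    spec = spec / np.maximum(np.abs(spec), 1e-12)             # PHAT: keep phase only
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)                # (F,)
    az = np.deg2rad(np.arange(n_az) * 360.0 / n_az)           # (A,)
    direction = np.stack([np.cos(az), np.sin(az)], axis=1)    # (A, 2) unit vectors
    # A plane wave from azimuth `az` reaches microphone m earlier by (p_m . u) / c.
    advance = direction @ mic_xy.T / c                        # (A, M) seconds
    # Undo that time advance for every candidate azimuth and frequency.
    steer = np.exp(-2j * np.pi * freqs[None, None, :] * advance[:, :, None])
    beam = np.sum(steer * spec[None, :, :], axis=1)           # (A, F)
    return np.abs(beam) ** 2 / frame.shape[0] ** 2            # values in [0, 1]

def ring(n_mics, radius=0.05):
    """Hypothetical uniform circular array used only for this illustration."""
    a = 2 * np.pi * np.arange(n_mics) / n_mics
    return radius * np.stack([np.cos(a), np.sin(a)], axis=1)

rng = np.random.default_rng(0)
for n_mics in (4, 8):
    frame = rng.standard_normal((n_mics, 512))
    amap = azimuth_frequency_map(frame, ring(n_mics), fs=16000)
    print(n_mics, amap.shape)  # same shape for both arrays
```

Under these assumptions, a CNN that takes this map as input would not need a different input layer when the microphone count or layout changes, whereas stacked per-microphone time-frequency features would grow with the number of microphones.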
Pages: 1 / 10
Page count: 10
Related papers
50 in total
  • [21] Burst Image Deblurring Using Permutation Invariant Convolutional Neural Networks
    Aittala, Miika
    Durand, Fredo
    COMPUTER VISION - ECCV 2018, PT VIII, 2018, 11212 : 748 - 764
  • [22] Eye pupil localization algorithm using convolutional neural networks
    Jun Ho Choi
    Kang Il Lee
    Byung Cheol Song
    Multimedia Tools and Applications, 2020, 79 : 32563 - 32574
  • [23] Eye pupil localization algorithm using convolutional neural networks
    Choi, Jun Ho
    Lee, Kang Il
    Song, Byung Cheol
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (43-44) : 32563 - 32574
  • [24] Angiodysplasia detection and localization using deep convolutional neural networks
    Shvets, Alexey A.
    Iglovikov, Vladimir I.
    Rakhlin, Alexander
    Kalinin, Alexandr A.
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 612 - 617
  • [25] Improving Fingerprint Indoor Localization Using Convolutional Neural Networks
    Sun, Danshi
    Wei, Erhu
    Yang, Li
    Xu, Shiyi
    IEEE ACCESS, 2020, 8 : 193396 - 193411
  • [26] Detection and Localization of Ultrasound Scatterers Using Convolutional Neural Networks
    Youn, Jihwan
    Ommen, Martin Lind
    Stuart, Matthias Bo
    Thomsen, Erik Vilain
    Larsen, Niels Bent
    Jensen, Jorgen Arendt
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2020, 39 (12) : 3855 - 3867
  • [27] Localization of Demyelinating Plaques in MRI using Convolutional Neural Networks
    Stasiak, Bartlomiej
    Tarasiuk, Pawel
    Michalska, Izabela
    Tomczyk, Arkadiusz
    Szczepaniak, Piotr S.
    PROCEEDINGS OF THE 10TH INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, VOL 2: BIOIMAGING, 2017, : 55 - 64
  • [28] Simultaneous Object Detection and Localization using Convolutional Neural Networks
    Zahra Ouadiay, Fatima
    Bouftaih, Hamza
    Bouyakhf, El Houssine
    Majid Himmi, M.
    2018 INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND COMPUTER VISION (ISCV2018), 2018,
  • [29] Localization Convolutional Neural Networks Using Angle of Arrival Images
    Comiter, Marcus
    Kung, H. T.
    2018 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2018,
  • [30] Voice Command Recognition Using Biologically Inspired Time-Frequency Representation and Convolutional Neural Networks
    Sharan, Roneel V.
    Berkovsky, Shlomo
    Liu, Sidong
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 998 - 1001