Configuration-Invariant Sound Localization Technique Using Azimuth-Frequency Representation and Convolutional Neural Networks

Cited by: 2
Authors
Chun, Chanjun [1 ]
Jeon, Kwang Myung [2 ]
Choi, Wooyeol [3 ]
Affiliations
[1] Korea Inst Civil Engn & Bldg Technol KICT, Future Infrastruct Res Ctr, Goyang 10223, South Korea
[2] IntFlow Co Ltd, Gwangju 61080, South Korea
[3] Chosun Univ, Dept Comp Engn, Gwangju 61452, South Korea
Funding
National Research Foundation, Singapore;
Keywords
azimuth-frequency representation; configuration-invariant; convolutional neural network (CNN); sound localization;
DOI
10.3390/s20133768
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Deep neural networks (DNNs) have achieved significant advancements in speech processing, and numerous types of DNN architectures have been proposed in the field of sound localization. When a DNN model is deployed for sound localization, a fixed input size is required. This size is generally determined by the number of microphones, the fast Fourier transform size, and the frame size. If the number or configuration of the microphones changes, the DNN model has to be retrained because the size of the input features changes. In this paper, we propose a configuration-invariant sound localization technique using the azimuth-frequency representation and convolutional neural networks (CNNs). The proposed CNN model receives the azimuth-frequency representation, instead of time-frequency features, as its input. The proposed model was evaluated with microphone configurations different from the one on which it was originally trained. For evaluation, a single sound source was simulated using the image method. The evaluations confirmed that the localization performance was superior to that of the conventional steered response power phase transform (SRP-PHAT) and multiple signal classification (MUSIC) methods.
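The abstract's central point, that an azimuth-frequency input keeps a fixed size when the microphone count changes, can be illustrated with a minimal sketch. The construction below is an assumption, not necessarily the authors' method: it builds a far-field, PHAT-weighted steered response per frequency bin over a grid of candidate azimuths. The function name `azimuth_frequency_map`, the 72-direction grid, the 512-sample frame, and the circular array layouts are all hypothetical choices made for illustration.

```python
import numpy as np

def azimuth_frequency_map(frames, mic_xy, fs, n_azimuth=72, c=343.0):
    """Illustrative azimuth-frequency map (assumed construction, not the paper's).

    frames: (n_mics, n_samples) array, one time frame per microphone.
    mic_xy: (n_mics, 2) microphone positions in metres.
    Returns an (n_azimuth, n_freq) array whose shape does not depend on n_mics.
    """
    n_mics, n_samples = frames.shape
    spectra = np.fft.rfft(frames, axis=1)               # (n_mics, n_freq)
    spectra = spectra / (np.abs(spectra) + 1e-12)       # PHAT-style: keep phase only
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)      # (n_freq,)

    azimuths = np.linspace(0.0, 2.0 * np.pi, n_azimuth, endpoint=False)
    directions = np.stack([np.cos(azimuths), np.sin(azimuths)], axis=1)

    # Far-field assumption: relative arrival delay of each mic for each azimuth.
    delays = -(directions @ mic_xy.T) / c               # (n_azimuth, n_mics)

    # Phase terms that undo those delays, per azimuth, mic, and frequency.
    steering = np.exp(2j * np.pi * freqs[None, None, :] * delays[:, :, None])

    # Sum over microphones: the microphone axis disappears here.
    beam = (steering * spectra[None, :, :]).sum(axis=1)  # (n_azimuth, n_freq)
    return np.abs(beam) ** 2 / n_mics ** 2

def circular_array(n_mics, radius=0.05):
    """Hypothetical uniform circular array centred at the origin."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_mics, endpoint=False)
    return radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)

rng = np.random.default_rng(0)
fs, n_samples = 16000, 512
map_4 = azimuth_frequency_map(rng.standard_normal((4, n_samples)), circular_array(4), fs)
map_8 = azimuth_frequency_map(rng.standard_normal((8, n_samples)), circular_array(8), fs)
print(map_4.shape, map_8.shape)  # (72, 257) (72, 257)
```

Because the microphone axis is summed out before the map is returned, a CNN fed this kind of representation would see the same input shape for a 4-microphone and an 8-microphone array, whereas stacked per-microphone time-frequency features would change shape with the array.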
Pages: 1 / 10
Number of pages: 10
Related papers
50 in total
  • [31] Sound Events Localization and Detection Using Bio-Inspired Gammatone Filters and Temporal Convolutional Neural Networks
    Rosero, Karen
    Grijalva, Felipe
    Masiero, Bruno
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 2314 - 2324
  • [32] A New Regional Localization Method for Indoor Sound Source Based on Convolutional Neural Networks
    Zhang, Xiaomeng
    Sun, Hao
    Wang, Shuopeng
    Xu, Jing
    IEEE ACCESS, 2018, 6 : 72073 - 72082
  • [33] QUATERNION CONVOLUTIONAL NEURAL NETWORKS FOR DETECTION AND LOCALIZATION OF 3D SOUND EVENTS
    Comminiello, Danilo
    Lella, Marco
    Scardapane, Simone
    Uncini, Aurelio
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 8533 - 8537
  • [34] AGRICULTURAL HARVESTER SOUND CLASSIFICATION USING CONVOLUTIONAL NEURAL NETWORKS AND SPECTROGRAMS
    Khorasani, Nioosha E.
    Thomas, Gabriel
    Balocco, Simone
    Mann, Danny
    APPLIED ENGINEERING IN AGRICULTURE, 2022, 38 (02) : 455 - 459
  • [35] Lung Sound Classification Using Snapshot Ensemble of Convolutional Neural Networks
    Truc Nguyen
    Pernkopf, Franz
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 760 - 763
  • [36] LOW LATENCY SOUND SOURCE SEPARATION USING CONVOLUTIONAL RECURRENT NEURAL NETWORKS
    Naithani, Gaurav
    Barker, Tom
    Parascandolo, Giambattista
    Bramslow, Lars
    Pontoppidan, Niels Henrik
    Virtanen, Tuomas
    2017 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2017, : 71 - 75
  • [37] Particle identification with neural networks using a rotational invariant moment representation
    Sinkus, R
    Voss, T
    NUCLEAR INSTRUMENTS & METHODS IN PHYSICS RESEARCH SECTION A-ACCELERATORS SPECTROMETERS DETECTORS AND ASSOCIATED EQUIPMENT, 1997, 389 (1-2): : 160 - 162
  • [38] Particle identification with neural networks using a rotational invariant moment representation
    Tel-Aviv Univ, Tel-Aviv, Israel
    Nucl Instrum Methods Phys Res Sect A, 2 (360-368):
  • [39] Particle identification with neural networks using a rotational invariant moment representation
    Sinkus, R
    Voss, T
    NUCLEAR INSTRUMENTS & METHODS IN PHYSICS RESEARCH SECTION A-ACCELERATORS SPECTROMETERS DETECTORS AND ASSOCIATED EQUIPMENT, 1997, 391 (02): : 360 - 368
  • [40] Classifying Heart Sound Recordings using Deep Convolutional Neural Networks and Mel-Frequency Cepstral Coefficients
    Rubin, Jonathan
    Abreu, Rui
    Ganguli, Anurag
    Nelaturi, Saigopal
    Matei, Ion
    Sricharan, Kumar
    2016 COMPUTING IN CARDIOLOGY CONFERENCE (CINC), VOL 43, 2016, 43 : 813 - 816