BROADBAND DOA ESTIMATION USING CONVOLUTIONAL NEURAL NETWORKS TRAINED WITH NOISE SIGNALS

Citations: 0
Authors
Chakrabarty, Soumitro [1]
Habets, Emanuel A. P. [1]
Affiliations
[1] Int Audio Labs Erlangen, Wolfsmantel 33, D-91058 Erlangen, Germany
Source
2017 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2017
Keywords
source localization; convolution neural networks; supervised learning; DOA estimation; location
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
A convolutional neural network (CNN) based classification method for broadband DOA estimation is proposed, where the phase component of the short-time Fourier transform coefficients of the received microphone signals is fed directly into the CNN and the features required for DOA estimation are learned during training. Since only the phase component of the input is used, the CNN can be trained with synthesized noise signals, which makes preparing the training data set easier than with speech signals. Through experimental evaluation, the ability of the proposed noise-trained CNN framework to generalize to speech sources is demonstrated. In addition, the robustness of the system to noise and to small perturbations in microphone positions, as well as its ability to adapt to different acoustic conditions, is investigated using experiments with simulated and real data.
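The input representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, hop size, number of microphones, and the integer sample delays standing in for propagation from a single direction are all assumptions made for illustration, and the function name stft_phase is hypothetical. The sketch synthesizes a white-noise source, builds delayed copies for each microphone, and keeps only the STFT phase, giving one (frequency bins x microphones) phase map per time frame that a classifier could take as input.

```python
import numpy as np

def stft_phase(signals, frame_len=512, hop=256):
    """Return the STFT phase per channel, shape (frames, bins, mics).

    signals: array of shape (n_mics, n_samples).
    frame_len and hop are illustrative choices, not values from the paper.
    """
    n_mics, n_samples = signals.shape
    window = np.hanning(frame_len)
    n_frames = 1 + (n_samples - frame_len) // hop
    phase = np.empty((n_frames, frame_len // 2 + 1, n_mics))
    for t in range(n_frames):
        start = t * hop
        frame = signals[:, start:start + frame_len] * window
        spectrum = np.fft.rfft(frame, axis=1)  # shape (n_mics, n_bins)
        phase[t] = np.angle(spectrum).T        # magnitude is discarded
    return phase

# Synthesized training-style signal: 1 s of white noise at an assumed 16 kHz.
rng = np.random.default_rng(0)
fs = 16000
noise = rng.standard_normal(fs)

# Assumed integer delays (in samples) for 4 microphones; a real setup would
# derive them from array geometry or simulated room impulse responses.
delays = np.array([0, 1, 2, 3])
mics = np.stack([np.roll(noise, d) for d in delays])

phase_maps = stft_phase(mics)
print(phase_maps.shape)
```

Because the magnitude is dropped, the spectral envelope of the source does not appear in the features, which is consistent with the abstract's point that noise signals can stand in for speech during training.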
Pages: 136-140
Number of pages: 5