A New Regional Localization Method for Indoor Sound Source Based on Convolutional Neural Networks

Cited by: 20
Authors
Zhang, Xiaomeng [1]
Sun, Hao [1]
Wang, Shuopeng [1]
Xu, Jing [1]
Affiliations
[1] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300130, Peoples R China
Source
IEEE ACCESS | 2018 / Vol. 6
Funding
National Natural Science Foundation of China;
Keywords
Sound source localization; machine learning; spectrogram; CNN; ACOUSTIC SOURCE LOCALIZATION; OF-ARRIVAL ESTIMATION; IDENTIFICATION; ENVIRONMENT; SPEAKERS;
DOI
10.1109/ACCESS.2018.2883341
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Sound source localization methods based on microphone arrays can be roughly classified into three categories: steerable beamforming based on maximum output power, high-resolution spectrogram estimation, and localization based on the time difference of arrival (TDOA). However, existing localization techniques lack accuracy and adaptability in unstructured indoor environments. In some practical situations, the location of the sound source is limited to predefined areas. In this paper, we propose a sound source region localization method based on convolutional neural networks (CNNs). Exploiting the weight characteristics of CNNs, we realize regional localization of a single indoor sound source by transforming the sound source signals into spectrograms and feeding them into the CNN. Finally, the CNN is trained and tested using the TensorFlow framework. Simulation experiments on the test sets show the effectiveness of the proposed method.
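The preprocessing step named in the abstract (turning microphone signals into spectrogram images for a CNN) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the frame length, hop size, Hann window, log scaling, four-microphone setup, and the helper names `spectrogram` and `array_to_image` are all assumptions, since the abstract does not state them.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram, shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)  # compress dynamic range


def array_to_image(channels, **kwargs):
    """Stack one spectrogram per microphone into (n_frames, n_bins, n_mics)."""
    return np.stack([spectrogram(ch, **kwargs) for ch in channels], axis=-1)


# Hypothetical input: a 1 kHz tone sampled at 16 kHz, seen by four microphones
# with small sample delays standing in for different propagation paths.
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
mics = [np.roll(tone, delay) for delay in (0, 3, 7, 11)]

image = array_to_image(mics)
print(image.shape)  # (124, 129, 4)
```

An array of this shape could then be passed to a small image classifier whose output classes are the predefined regions; the network architecture itself is not described in the abstract and is left out here.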
Pages: 72073-72082
Number of pages: 10