Perspectives on microphone array processing including sparse recovery, ray space analysis, and neural networks

Cited by: 2
Authors
Jin, Craig T. [1 ]
Yu, Shiduo [1 ]
Antonacci, Fabio [2 ]
Sarti, Augusto [2 ]
Affiliations
[1] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW, Australia
[2] Politecn Milan, Dipartimento Elettron & Informaz, Milan, Italy
Keywords
Microphone array; Sparse recovery; Ray space; Convolutional neural networks; OF-ARRIVAL ESTIMATION; LOCALIZATION; ESPRIT;
DOI
10.1250/ast.41.308
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Hands-free audio services supporting speech communication are playing an increasingly ubiquitous and foundational role in everyday life as services for the home and work become more automated, interactive and robotic. People will speak their instructions (e.g. Siri) to control and interact with their environment. This makes it an exciting time for acoustics engineering because the demands on microphone array performance are rapidly increasing. The microphone arrays are expected to work at increasing distances in noisy and reverberant situations; they are expected to record not just the sound content, but also the sound field; they are expected to work in multi-talker situations and even on moving, robotic platforms. Audio technology is currently undergoing rapid change in which it is becoming feasible, from both a cost and hardware point-of-view, to incorporate multiple and distributed microphone arrays with hundreds or even thousands of microphones within a built environment. In this review paper, we consider microphone array signal processing from two relatively recent vantage points: sparse recovery and ray space analysis. To a lesser extent, we also consider neural networks. We present the principles underlying each method. We consider the advantages and disadvantages of the approaches and also present possible methods to integrate these techniques.
Pages: 308-317
Page count: 10
Related papers
34 in total
  • [1] Adavanne S, 2018, EUR SIGNAL PR CONF, P1462, DOI 10.23919/EUSIPCO.2018.8553182
  • [2] Asaei Afsaneh, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), P1439, DOI 10.1109/ICASSP.2014.6853835
  • [3] Benesty J., 2008, MICROPHONE ARRAY SIG
  • [4] The Ray Space Transform: A New Framework for Wave Field Processing
    Bianchi, Lucio
    Antonacci, Fabio
    Sarti, Augusto
    Tubaro, Stefano
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2016, 64 (21) : 5696 - 5706
  • [5] Chakrabarty S, 2017, IEEE WORK APPL SIG, P136, DOI 10.1109/WASPAA.2017.8170010
  • [6] Comanducc L., 2018, 2018 INT WORKSH AC S
  • [7] Sparse solutions to linear inverse problems with multiple measurement vectors
    Cotter, SF
    Rao, BD
    Engan, K
    Kreutz-Delgado, K
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2005, 53 (07) : 2477 - 2488
  • [8] Iteratively Reweighted Least Squares Minimization for Sparse Recovery
    Daubechies, Ingrid
    Devore, Ronald
    Fornasier, Massimo
    Guentuerk, C. Sinan
    [J]. COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS, 2010, 63 (01) : 1 - 38
  • [9] Spherical Harmonic Signal Covariance and Sound Field Diffuseness
    Epain, Nicolas
    Jin, Craig T.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (10) : 1796 - 1807
  • [10] GarciaFrias J., 2007, P 2007 DAT COMPR C D