Perspectives on microphone array processing including sparse recovery, ray space analysis, and neural networks

Cited by: 2

Authors:

Jin, Craig T. ^{[1]}

Yu, Shiduo ^{[1]}

Antonacci, Fabio ^{[2]}

Sarti, Augusto ^{[2]}

Affiliations:

[1] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW, Australia

[2] Politecn Milan, Dipartimento Elettron & Informaz, Milan, Italy

Source:

ACOUSTICAL SCIENCE AND TECHNOLOGY | 2020 / Vol. 41 / No. 01

Keywords:

Microphone array; Sparse recovery; Ray space; Convolutional neural networks; OF-ARRIVAL ESTIMATION; LOCALIZATION; ESPRIT;

DOI:

10.1250/ast.41.308

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Hands-free audio services supporting speech communication are playing an increasingly ubiquitous and foundational role in everyday life as services for the home and work become more automated, interactive and robotic. People will speak their instructions (e.g. Siri) to control and interact with their environment. This makes it an exciting time for acoustics engineering because the demands on microphone array performance are rapidly increasing. The microphone arrays are expected to work at increasing distances in noisy and reverberant situations; they are expected to record not just the sound content, but also the sound field; they are expected to work in multi-talker situations and even on moving, robotic platforms. Audio technology is currently undergoing rapid change in which it is becoming feasible, from both a cost and hardware point-of-view, to incorporate multiple and distributed microphone arrays with hundreds or even thousands of microphones within a built environment. In this review paper, we consider microphone array signal processing from two relatively recent vantage points: sparse recovery and ray space analysis. To a lesser extent, we also consider neural networks. We present the principles underlying each method. We consider the advantages and disadvantages of the approaches and also present possible methods to integrate these techniques.


Pages: 308-317

Page count: 10

