Direction of Arrival With One Microphone, a Few LEGOs, and Non-Negative Matrix Factorization

Cited by: 8
Authors
El Badawy, Dalia [1]
Dokmanic, Ivan [2]
Affiliations
[1] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
Funding
Swiss National Science Foundation
Keywords
Direction-of-arrival estimation; group sparsity; monaural localization; non-negative matrix factorization; sound scattering; universal speech model; SOUND LOCALIZATION; SOURCE SEPARATION; SPECTRAL CUES; DIVERGENCE; ALGORITHMS;
DOI
10.1109/TASLP.2018.2867081
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Conventional approaches to sound source localization require at least two microphones. It is known, however, that people with unilateral hearing loss can also localize sounds. Monaural localization is possible thanks to the scattering by the head, though it hinges on learning the spectra of the various sources. We take inspiration from this human ability to propose algorithms for accurate sound source localization using a single microphone embedded in an arbitrary scattering structure. The structure modifies the frequency response of the microphone in a direction-dependent way giving each direction a signature. While knowing those signatures is sufficient to localize sources of white noise, localizing speech is much more challenging: it is an ill-posed inverse problem, which we regularize by prior knowledge in the form of learned non-negative dictionaries. We demonstrate a monaural speech localization algorithm based on non-negative matrix factorization that does not depend on sophisticated, designed scatterers. In fact, we show experimental results with ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures we can accurately localize arbitrary speakers; that is, we do not need to learn the dictionary for the particular speaker to be localized. Finally, we discuss multi-source localization and the related limitations of our approach.
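The abstract states that knowing the direction-dependent signatures is sufficient to localize white-noise sources. The following is a minimal sketch of that idea only, not the authors' algorithm: it assumes each direction's signature is a known power response per frequency bin and picks the direction whose normalized signature correlates best with the observed average power spectrum. The function name `localize_white_noise`, the random signatures, and all sizes are hypothetical, chosen for illustration. Speech localization, which the abstract says needs learned non-negative dictionaries, is not covered here.

```python
import numpy as np


def localize_white_noise(observed_power, signatures):
    """Return the index of the best-matching direction (illustrative only).

    observed_power: shape (F,), average power spectrum at the microphone.
    signatures: shape (F, D), assumed power response per bin and direction.
    """
    # Normalize so the unknown source level does not affect the match.
    obs = observed_power / np.linalg.norm(observed_power)
    sig = signatures / np.linalg.norm(signatures, axis=0, keepdims=True)
    # Cosine similarity between the observation and every signature.
    scores = sig.T @ obs
    return int(np.argmax(scores))


# Synthetic setup: random signatures stand in for a measured scatterer.
rng = np.random.default_rng(0)
F, D, T = 64, 12, 200  # frequency bins, candidate directions, frames
signatures = rng.uniform(0.1, 1.0, size=(F, D))
true_dir = 7

# White noise has a flat expected spectrum; per-frame power fluctuates.
source_power = rng.exponential(1.0, size=(F, T))
frames = signatures[:, [true_dir]] * source_power
observed = frames.mean(axis=1)

estimate = localize_white_noise(observed, signatures)
print("true direction:", true_dir, "estimated direction:", estimate)
```

Averaging over frames matters in this sketch: a single frame of white noise has a very rough spectrum, while the average approaches the flat spectrum that makes the signature directly visible. For speech the source spectrum is not flat, which is why the simple matching above is not enough.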
Pages: 2436-2446
Number of pages: 11