Learning a precedence effect-like weighting function for the generalized cross-correlation framework

Cited by: 20
Authors
Wilson, Kevin W. [1]
Darrell, Trevor [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006 / Vol. 14 / No. 6
Keywords
acoustic arrays; array signal processing; delay estimation; direction of arrival estimation; speech processing;
DOI
10.1109/TASL.2006.872601
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Speech source localization in reverberant environments has proved difficult for automated microphone array systems. Because speech is nonstationary, certain features observable in the reverberant speech signal, such as sudden increases in audio energy, provide cues to indicate time-frequency regions that are particularly useful for audio localization. We exploit these cues by learning a mapping from reverberated signal spectrograms to localization precision using ridge regression. Using the learned mappings in the generalized cross-correlation framework, we demonstrate improved localization performance. Additionally, the resulting mappings exhibit behavior consistent with the well-known precedence effect from psychoacoustic studies.
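The two ingredients named in the abstract, ridge regression and a weighting function inside generalized cross-correlation (GCC), can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `ridge_fit` and `gcc_delay`, the regularization parameter `lam`, and the PHAT fallback weighting are all assumptions introduced here. In the paper's setting, the `weight` argument would presumably carry per-frequency weights predicted from the spectrogram by the learned mapping.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: solve (X^T X + lam*I) w = X^T y.

    Hypothetical helper; X is (n_samples, n_features), y is (n_samples,).
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def gcc_delay(x1, x2, weight=None, max_lag=None):
    """Estimate the delay (in samples) of x2 relative to x1 with weighted GCC.

    `weight` is a per-frequency weighting, scalar or of length n//2 + 1.
    If it is None, the standard PHAT weighting 1/|cross-spectrum| is used
    as a stand-in (an assumption, not the learned weighting of the paper).
    """
    n = len(x1) + len(x2)                # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)             # cross-power spectrum
    if weight is None:
        weight = 1.0 / np.maximum(np.abs(cross), 1e-12)
    cc = np.fft.irfft(weight * cross, n=n)
    if max_lag is None:
        max_lag = min(len(x1), len(x2)) - 1
    # Reorder so that index 0 corresponds to lag -max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag

# Usage: delay a noise signal by 7 samples and recover the delay.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(7), s[:-7]))
estimated = gcc_delay(s, delayed)
```

Passing `weight=1.0` reduces the estimator to plain cross-correlation, which makes it easy to compare different weightings on the same signal pair.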
Pages: 2156-2164
Number of pages: 9