A multidimensional dynamic time warping algorithm for efficient multimodal fusion of asynchronous data streams

Cited by: 47
Authors
Woellmer, Martin [1]
Al-Hames, Marc [1]
Eyben, Florian [1]
Schuller, Bjoern [1]
Rigoll, Gerhard [1]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-80290 Munich, Germany
Funding
Academy of Finland
Keywords
Dynamic time warping; Multimodal data fusion; Asynchronous hidden Markov model; PROGRAMMING ALGORITHM; RECOGNITION; SPEECH; INTERFACE; ONLINE; MODELS; FACE
DOI
10.1016/j.neucom.2009.08.005
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
To overcome the computational complexity of the asynchronous hidden Markov model (AHMM), we present a novel multidimensional dynamic time warping (DTW) algorithm for hybrid fusion of asynchronous data. We show that our newly introduced multidimensional DTW concept requires significantly less decoding time while providing the same data fusion flexibility as the AHMM. Thus, it can be applied in a wide range of real-time multimodal classification tasks. Optimally exploiting mutual information during decoding even if the input streams are not synchronous, our algorithm outperforms late and early fusion techniques in a challenging bimodal speech and gesture fusion experiment. (C) 2009 Elsevier B.V. All rights reserved.
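For orientation, the sketch below shows classic dynamic time warping between two sequences of feature vectors, the standard technique that the abstract's multidimensional variant builds on. It is not the authors' algorithm: the function name `dtw_distance`, the Euclidean local cost, and the three-way step pattern are all assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension vectors.

    Illustrative assumption: Euclidean local distance and the basic
    (insert, delete, match) step pattern. This is NOT the paper's
    multidimensional fusion algorithm.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two current frames
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            # extend the cheapest of the three admissible predecessor cells
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because the warping path may repeat frames of either sequence, two streams that differ only in local timing (for example, one frame held longer in one stream) align at zero cost, which is the property that makes DTW attractive for asynchronous inputs.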
Pages: 366-380
Page count: 15