A multidimensional dynamic time warping algorithm for efficient multimodal fusion of asynchronous data streams

Cited by: 47
Authors
Woellmer, Martin [1]
Al-Hames, Marc [1]
Eyben, Florian [1]
Schuller, Bjoern [1]
Rigoll, Gerhard [1]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-80290 Munich, Germany
Funding
Academy of Finland;
Keywords
Dynamic time warping; Multimodal data fusion; Asynchronous hidden Markov model; PROGRAMMING ALGORITHM; RECOGNITION; SPEECH; INTERFACE; ONLINE; MODELS; FACE;
DOI
10.1016/j.neucom.2009.08.005
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To overcome the computational complexity of the asynchronous hidden Markov model (AHMM), we present a novel multidimensional dynamic time warping (DTW) algorithm for hybrid fusion of asynchronous data. We show that our newly introduced multidimensional DTW concept requires significantly less decoding time while providing the same data fusion flexibility as the AHMM. Thus, it can be applied in a wide range of real-time multimodal classification tasks. Optimally exploiting mutual information during decoding even if the input streams are not synchronous, our algorithm outperforms late and early fusion techniques in a challenging bimodal speech and gesture fusion experiment. (C) 2009 Elsevier B.V. All rights reserved.
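For orientation, the following is a minimal sketch of classic two-sequence dynamic time warping over multidimensional feature vectors. It is not the multidimensional DTW algorithm introduced in the paper, whose details are not given in this record; the function name `dtw_distance`, the Euclidean local distance, and the three-neighbour step pattern are all assumptions chosen for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature vectors.

    Illustrative baseline only; not the paper's multidimensional DTW.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local distance between frames (assumed Euclidean)
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three predecessor alignments
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

In this sketch a sequence that repeats a frame, such as `[(0, 0), (1, 1), (1, 1), (2, 2)]`, aligns with `[(0, 0), (1, 1), (2, 2)]` at zero cost, which is the kind of tolerance to timing differences that DTW-based fusion relies on.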
Pages: 366-380
Number of pages: 15