Multi-party focus of attention recognition in meetings from head pose and multimodal contextual cues

Cited by: 3
Authors
Ba, Sileye O. [1 ,2 ]
Odobez, Jean-Marc [1 ,2 ]
Affiliations
[1] IDIAP Res Inst, 1920 Martigny, Lausanne, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Source
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008
Keywords
visual focus of attention; multi-modal; contextual cues; head pose; meeting analysis;
DOI
10.1109/ICASSP.2008.4518086
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper presents investigations on visual focus of attention (VFOA) recognition in meetings from audio-visual perceptual cues. Rather than independently recognizing the VFOA of each participant from his own head pose, we propose to recognize participants' VFOA jointly in order to introduce context-dependent interaction models that relate to group activity and the social dynamics of communication. To this end, we designed an input-output hidden Markov model (IOHMM) whose hidden states are the joint VFOA of all participants and whose main observations are the head poses. Interaction models are introduced in the form of contextual cues that affect the temporal evolution of the joint VFOA sequence, allowing us to model group dynamics that account for people's tendency to share the same focus, or to have their VFOA driven by contextual cues such as slide activity or the participants' speaking activity. The model is rigorously evaluated on a publicly available dataset of 4 real meetings of 23 min on average, showing an overall 10% relative performance increase w.r.t. the independent recognition case.
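The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of an input-output HMM: the transition matrix used at each time step is selected by a contextual input, and the most likely hidden-state sequence is found by Viterbi decoding. The function name `iohmm_viterbi`, the two toy states ("speaker", "slide"), the two context values, and all probabilities are hypothetical and chosen for illustration. In the setting described above, each hidden state would instead be a joint assignment of a focus target to every participant, and the observation terms would come from head-pose measurements.

```python
import math


def iohmm_viterbi(log_pi, log_A_by_ctx, log_B, contexts):
    """Most likely hidden-state path for a context-driven (input-output) HMM.

    log_pi[j]             : log prior of state j at time 0
    log_A_by_ctx[c][i][j] : log P(state_t = j | state_{t-1} = i, context_t = c)
    log_B[t][j]           : log-likelihood of the observation at time t under state j
    contexts[t]           : context index at time t (contexts[0] is not used)
    """
    n_steps = len(log_B)
    n_states = len(log_pi)
    # Best log-score of any path ending in each state at time 0.
    delta = [log_pi[j] + log_B[0][j] for j in range(n_states)]
    backpointers = []
    for t in range(1, n_steps):
        # The contextual input selects which transition matrix applies at t.
        log_A = log_A_by_ctx[contexts[t]]
        new_delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: delta[i] + log_A[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log_A[best_i][j] + log_B[t][j])
        delta = new_delta
        backpointers.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda j: delta[j])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def _log_matrix(rows):
    return [[math.log(p) for p in row] for row in rows]


# Toy setup (all values are illustrative assumptions):
# state 0 = "looking at speaker", state 1 = "looking at slide"
# context 0 = no slide change, context 1 = slide just changed
LOG_PI = [math.log(0.8), math.log(0.2)]
LOG_A = {
    0: _log_matrix([[0.9, 0.1], [0.1, 0.9]]),  # focus tends to persist
    1: _log_matrix([[0.2, 0.8], [0.1, 0.9]]),  # a slide change attracts focus
}
# Uninformative observations, so only the context drives the decoded path.
LOG_B = [[math.log(0.5), math.log(0.5)] for _ in range(4)]

with_cue = iohmm_viterbi(LOG_PI, LOG_A, LOG_B, [0, 0, 1, 0])
without_cue = iohmm_viterbi(LOG_PI, LOG_A, LOG_B, [0, 0, 0, 0])
```

With uninformative observations, the decoded path switches to the "slide" state only when the slide-change context is present, which is the kind of cue-driven behaviour the abstract attributes to contextual cues.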
Pages: 2221 / +
Page count: 2
Related papers
7 in total
[1] [Anonymous], READINGS SPEECH RECO
[2] BA SO, 2005, P ACM ICMI MMMP, P9
[3] KULYK O, 2006, LNCS, V3869, P150
[4] ODOBEZ JM, 2007, P INT C MULT EXP
[5] Otsuka K., 2006, P INT C MULT EXP
[6] Otsuka Kazuhiro, 2005, PROC 7 ACM INT C MUL, P191
[7] Stiefelhagen, R; Yang, J; Waibel, A. Modeling focus of attention for meeting indexing based on multiple cues [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2002, 13 (04): 928-938