Automatic analysis of multimodal group actions in meetings

Cited by: 185
Authors
McCowan, I [1]
Gatica-Perez, D [1]
Bengio, S [1]
Lathoud, G [1]
Barnard, M [1]
Zhang, D [1]
Affiliation
[1] IDIAP Res Inst, CH-1920 Martigny, Switzerland
Keywords
statistical models; multimedia applications and numerical signal processing; computer conferencing; asynchronous interaction
DOI
10.1109/TPAMI.2005.49
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates the recognition of group actions in meetings. A framework is employed in which group actions result from the interactions of the individual participants. The group actions are modeled using different HMM-based approaches, where the observations are provided by a set of audiovisual features monitoring the actions of individuals. Experiments demonstrate the importance of taking interactions into account in modeling the group actions. It is also shown that the visual modality contains useful information, even for predominantly audio-based events, motivating a multimodal approach to meeting analysis.
Pages: 305-317
Number of pages: 13