MultiMediate: Multi-modal Group Behaviour Analysis for Artificial Mediation

Cited by: 12

Authors:

Mueller, Philipp [1]

Dietz, Michael [2]

Schiller, Dominik [2]

Thomas, Dominike [3]

Zhang, Guanhua [3]

Gebhard, Patrick [1]

Andre, Elisabeth [2]

Bulling, Andreas [3]

Affiliations:

[1] DFKI GmbH, Saarbrucken, Germany

[2] Augsburg Univ, Augsburg, Germany

[3] Univ Stuttgart, Stuttgart, Germany

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021

Funding:

European Research Council;

Keywords:

challenge; dataset; eye contact detection; next speaker prediction; TURN-TAKING; PREDICTION; GAZE;

DOI:

10.1145/3474085.3479219

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Artificial mediators are a promising means of supporting human group conversations, but at present their abilities are limited by insufficient progress in group behaviour analysis. The MultiMediate challenge addresses, for the first time, two fundamental group behaviour analysis tasks in well-defined conditions: eye contact detection and next speaker prediction. For training and evaluation, MultiMediate makes use of the MPIIGroupInteraction dataset, consisting of 22 three- to four-person discussions, as well as an unpublished test set of six additional discussions. This paper describes the MultiMediate challenge and presents the challenge dataset, including novel fine-grained speaking annotations that were collected for the purpose of MultiMediate. Furthermore, we present baseline approaches and ablation studies for both challenge tasks.

Pages: 4878-4882

Page count: 5
