MultiMediate : Multi-modal Group Behaviour Analysis for Artificial Mediation

Cited by: 12
Authors
Mueller, Philipp [1]
Dietz, Michael [2]
Schiller, Dominik [2]
Thomas, Dominike [3]
Zhang, Guanhua [3]
Gebhard, Patrick [1]
Andre, Elisabeth [2]
Bulling, Andreas [3]
Affiliations
[1] DFKI GmbH, Saarbrucken, Germany
[2] Augsburg Univ, Augsburg, Germany
[3] Univ Stuttgart, Stuttgart, Germany
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
European Research Council;
Keywords
challenge; dataset; eye contact detection; next speaker prediction; TURN-TAKING; PREDICTION; GAZE;
DOI
10.1145/3474085.3479219
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Artificial mediators are promising to support human group conversations but at present their abilities are limited by insufficient progress in group behaviour analysis. The MultiMediate challenge addresses, for the first time, two fundamental group behaviour analysis tasks in well-defined conditions: eye contact detection and next speaker prediction. For training and evaluation, MultiMediate makes use of the MPIIGroupInteraction dataset consisting of 22 three- to four-person discussions as well as of an unpublished test set of six additional discussions. This paper describes the MultiMediate challenge and presents the challenge dataset including novel fine-grained speaking annotations that were collected for the purpose of MultiMediate. Furthermore, we present baseline approaches and ablation studies for both challenge tasks.
Pages: 4878-4882
Page count: 5