Investigation of Small Group Social Interactions Using Deep Visual Activity-Based Nonverbal Features

Cited: 16
Authors
Beyan, Cigdem [1]
Shahid, Muhammad [1,2]
Murino, Vittorio [1,3]
Affiliations
[1] Istituto Italiano di Tecnologia, Pattern Analysis & Computer Vision, Genoa, Italy
[2] University of Genoa, Electrical, Electronics and Telecommunications Engineering and Naval Architecture, Genoa, Italy
[3] University of Verona, Department of Computer Science, Verona, Italy
Source
PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18) | 2018
Keywords
social interactions; small groups; meetings; visual activity; nonverbal behavior; deep neural network; feature encoding; automatic analysis; recognition
DOI
10.1145/3240508.3240685
CLC number
TP301 [Theory and Methods]
Discipline code
081202
Abstract
Understanding small-group face-to-face interactions is a prominent research problem in social psychology, and its automatic analysis has recently become popular in social computing. It is mainly investigated in terms of nonverbal behaviors, as these are one of the main facets of communication. Among the many multi-modal nonverbal cues, visual activity is an important one, and sufficiently good performance from it can be crucial, for instance, when audio sensors are missing. The existing visual activity-based nonverbal features, all of which are hand-crafted, perform well enough for some applications but fall short on others. Given these observations, we argue that more robust feature representations are needed, representations that can be learned from the data itself. To this end, we propose a novel method composed of optical flow computation, deep neural network-based feature learning, feature encoding, and classification. Additionally, a comprehensive comparison of different feature encoding techniques is presented. The proposed method is tested on three research topics that can be perceived during small-group interactions, i.e., meetings: i) emergent leader detection, ii) emergent leadership style prediction, and iii) high/low extraversion classification. The proposed method shows (significantly) better results not only compared to the state-of-the-art visual activity-based nonverbal features, but also when those features are combined with other audio-based and video-based nonverbal features.
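The abstract describes a four-stage pipeline: optical flow computation, deep feature learning, feature encoding, and classification. The sketch below is a minimal NumPy illustration of that pipeline's shape only, with every stage replaced by a toy stand-in: frame differencing in place of dense optical flow, a fixed random projection in place of the trained network, and a small random codebook in place of a learned one. None of these stand-ins are the paper's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)

def motion_map(frames):
    """Crude motion proxy via absolute frame differencing.
    (The paper computes dense optical flow; this is a placeholder.)"""
    return np.abs(np.diff(frames, axis=0))

def deep_features(motion, W):
    """Placeholder for DNN-based feature learning: a fixed random
    projection followed by ReLU, not the paper's trained network."""
    x = motion.reshape(motion.shape[0], -1)
    return np.maximum(x @ W, 0.0)

def encode(features, codebook):
    """Bag-of-visual-words encoding: a normalized histogram of
    nearest-codeword assignments, one simple way to pool per-frame
    features into a fixed-length clip descriptor."""
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = d.argmin(axis=1)
    hist = np.bincount(assign, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Toy data: 10 frames of an 8x8 "video" for one person.
frames = rng.random((10, 8, 8))
W = rng.standard_normal((64, 16))        # hypothetical projection: 64 pixels -> 16 dims
codebook = rng.standard_normal((5, 16))  # hypothetical 5-word codebook

feat = deep_features(motion_map(frames), W)
vec = encode(feat, codebook)
print(vec.shape)  # one fixed-length descriptor per clip, fed to a classifier
```

The resulting fixed-length vector is what a downstream classifier (the paper's final stage) would consume, one descriptor per participant per clip.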
Pages: 311-319
Page count: 9