Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling

Cited by: 1
Authors
Mawalim, Candy Olivia [1]
Okada, Shogo [1]
Nakano, Yukiko I. [2]
Affiliations
[1] Japan Adv Inst Sci & Technol, 1-1 Asahidai, Nomi, Ishikawa 9231292, Japan
[2] Seikei Univ, Musashino, Tokyo, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Multimodal analysis; time-series modeling; task independent; communication skills; group discussion; FACIAL EXPRESSION;
DOI
10.1145/3450283
CLC Classification
TP [Automation Technology, Computer Technology];
Subject Classification
0812;
Abstract
Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants' engagement with each other in a specific realistic context. In this article, multimodal analysis was performed to estimate CS indices using a three-task-type group discussion dataset, the MATRICS corpus. The research investigated the effectiveness of employing both static and time-series modeling, especially in task-independent settings. This investigation aimed to clarify three main points: first, the effectiveness of time-series modeling compared to nonsequential modeling; second, multimodal analysis in a task-independent setting; and third, the important differences to consider when dealing with task-dependent versus task-independent settings, specifically in terms of modalities and prediction models. Several modalities were extracted (e.g., acoustics, speaking turns, linguistic features, dialog act tags, head motion, and face feature sets) for inferring the CS indices as a regression task. Three predictive models were considered: support vector regression (SVR), long short-term memory (LSTM), and an enhanced time-series model (an LSTM combining static and time-series features). Evaluation was conducted using the R² score in a cross-validation scheme. The experimental results suggest that time-series modeling can significantly improve the performance of multimodal analysis in the task-dependent setting (best R² = 0.797 for the total CS index), with word2vec being the most prominent feature. However, highly context-related features did not generalize well to the task-independent setting. We therefore propose an enhanced LSTM model for the task-independent setting, which achieved better performance than the conventional SVR and LSTM models (best R² = 0.602 for the total CS index). In other words, this study shows that appropriate time-series modeling can outperform traditional nonsequential modeling for automatically estimating a participant's CS indices in a group discussion, in both task-dependent and task-independent settings.
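As a hedged illustration of the enhanced time-series model described in the abstract, the sketch below shows one common way to fuse an LSTM summary of per-interval multimodal features with static features ahead of a regression head that predicts a CS index. This is not the authors' code; the layer sizes, feature dimensions, and names (EnhancedLSTMRegressor, seq_feat_dim, static_feat_dim) are illustrative assumptions.

```python
# Minimal sketch (assumed architecture, not the paper's implementation):
# an LSTM encodes the time-series modalities, its final hidden state is
# concatenated with static features, and a small MLP regresses the CS index.
import torch
import torch.nn as nn

class EnhancedLSTMRegressor(nn.Module):
    def __init__(self, seq_feat_dim=64, static_feat_dim=32, hidden_dim=128):
        super().__init__()
        # Encodes sequential features (e.g., acoustic or head-motion frames).
        self.lstm = nn.LSTM(seq_feat_dim, hidden_dim, batch_first=True)
        # Regression head over [LSTM summary ; static features].
        self.head = nn.Sequential(
            nn.Linear(hidden_dim + static_feat_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # one scalar CS index per participant
        )

    def forward(self, seq_x, static_x):
        # seq_x: (batch, time, seq_feat_dim); static_x: (batch, static_feat_dim)
        _, (h_n, _) = self.lstm(seq_x)
        fused = torch.cat([h_n[-1], static_x], dim=1)  # final hidden state + static
        return self.head(fused).squeeze(1)

# Toy usage: 8 participants, 50 time steps of sequential features each.
model = EnhancedLSTMRegressor()
preds = model(torch.randn(8, 50, 64), torch.randn(8, 32))
```

The per-fold R² evaluation mentioned in the abstract could then be computed on held-out participants with, e.g., sklearn.metrics.r2_score.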
Pages: 27