Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling

Cited by: 1
Authors
Mawalim, Candy Olivia [1]
Okada, Shogo [1]
Nakano, Yukiko I. [2]
Affiliations
[1] Japan Adv Inst Sci & Technol, 1-1 Asahidai, Nomi, Ishikawa 9231292, Japan
[2] Seikei Univ, Musashino, Tokyo, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Multimodal analysis; time-series modeling; task independent; communication skills; group discussion; FACIAL EXPRESSION;
DOI
10.1145/3450283
CLC Classification
TP [Automation Technology, Computer Technology];
Subject Classification
0812;
Abstract
Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants' engagement with each other in a specific realistic context. In this article, multimodal analysis was performed to estimate CS indices using a three-task-type group discussion dataset, the MATRICS corpus. The research investigated the effectiveness of employing both static and time-series modeling, especially in task-independent settings. This investigation aimed to clarify three main points: first, the effectiveness of time-series modeling compared to nonsequential modeling; second, multimodal analysis in a task-independent setting; and third, the important differences to consider when dealing with task-dependent versus task-independent settings, specifically in terms of modalities and prediction models. Several modalities were extracted (e.g., acoustics, speaking turns, linguistic features, dialog act tags, head motion, and face feature sets) for inferring the CS indices as a regression task. Three predictive models were considered: support vector regression (SVR), long short-term memory (LSTM), and an enhanced time-series model (an LSTM combining static and time-series features). Evaluation was conducted using the R² score in a cross-validation scheme. The experimental results suggest that time-series modeling can significantly improve the performance of multimodal analysis in the task-dependent setting (best R² = 0.797 for the total CS index), with word2vec being the most prominent feature. However, highly context-related features did not generalize well to the task-independent setting. We therefore propose an enhanced LSTM model for the task-independent setting, which achieved better performance than the conventional SVR and LSTM models (best R² = 0.602 for the total CS index). In other words, this study shows that appropriate time-series modeling can outperform traditional nonsequential modeling for automatically estimating a participant's CS indices in a group discussion, in both task-dependent and task-independent settings.
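As a hedged illustration of the enhanced time-series model described in the abstract, the sketch below shows one common way to fuse an LSTM summary of per-interval multimodal features with static features ahead of a regression head that predicts a CS index. This is not the authors' code; the layer sizes, feature dimensions, and names (EnhancedLSTMRegressor, seq_feat_dim, static_feat_dim) are illustrative assumptions.

```python
# Minimal sketch (assumed architecture, not the paper's implementation):
# an LSTM encodes the time-series modalities, its final hidden state is
# concatenated with static features, and a small MLP regresses the CS index.
import torch
import torch.nn as nn

class EnhancedLSTMRegressor(nn.Module):
    def __init__(self, seq_feat_dim=64, static_feat_dim=32, hidden_dim=128):
        super().__init__()
        # Encodes sequential features (e.g., acoustic or head-motion frames).
        self.lstm = nn.LSTM(seq_feat_dim, hidden_dim, batch_first=True)
        # Regression head over [LSTM summary ; static features].
        self.head = nn.Sequential(
            nn.Linear(hidden_dim + static_feat_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # one scalar CS index per participant
        )

    def forward(self, seq_x, static_x):
        # seq_x: (batch, time, seq_feat_dim); static_x: (batch, static_feat_dim)
        _, (h_n, _) = self.lstm(seq_x)
        fused = torch.cat([h_n[-1], static_x], dim=1)  # final hidden state + static
        return self.head(fused).squeeze(1)

# Toy usage: 8 participants, 50 time steps of sequential features each.
model = EnhancedLSTMRegressor()
preds = model(torch.randn(8, 50, 64), torch.randn(8, 32))
```

The per-fold R² evaluation mentioned in the abstract could then be computed on held-out participants with, e.g., sklearn.metrics.r2_score.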
Pages: 27