Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling

Cited by: 1
Authors
Mawalim, Candy Olivia [1 ]
Okada, Shogo [1 ]
Nakano, Yukiko I. [2]
Affiliations
[1] Japan Adv Inst Sci & Technol, 1-1 Asahidai, Nomi, Ishikawa 9231292, Japan
[2] Seikei Univ, Musashino, Tokyo, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Multimodal analysis; time-series modeling; task independent; communication skills; group discussion; facial expression;
DOI
10.1145/3450283
Chinese Library Classification
TP [Automation Technology; Computer Technology];
Discipline Code
0812;
Abstract
Case studies of group discussions are considered an effective way to assess communication skills (CS), since they let researchers evaluate how participants engage with one another in a realistic context. In this article, multimodal analysis was performed to estimate CS indices using a group discussion dataset with three task types, the MATRICS corpus. The research investigated the effectiveness of both static and time-series modeling, especially in task-independent settings, with three aims: first, to compare time-series modeling against nonsequential modeling; second, to examine multimodal analysis in a task-independent setting; and third, to identify the differences between task-dependent and task-independent settings, specifically in terms of modalities and prediction models. Several modalities were extracted (e.g., acoustic, speaking-turn, linguistic, dialog-act, head-motion, and face feature sets) to infer the CS indices as a regression task. Three predictive models were considered: support vector regression (SVR), long short-term memory (LSTM), and an enhanced time-series model (an LSTM combining static and time-series features). Evaluation used the R² score under a cross-validation scheme. The experimental results suggested that time-series modeling significantly improves the performance of multimodal analysis in the task-dependent setting (best R² = 0.797 for the total CS index), with word2vec being the most prominent feature. However, highly context-dependent features did not transfer well to the task-independent setting.
We therefore propose an enhanced LSTM model for task-independent settings, which outperformed the conventional SVR and LSTM models (best R² = 0.602 for the total CS index). In other words, this study shows that an appropriate time-series model can outperform traditional nonsequential modeling for automatically estimating a participant's CS indices in a group discussion, with regard to task dependency.
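As a rough illustration of the evaluation protocol the abstract describes (R² scores under cross-validation, with SVR as the nonsequential baseline), the sketch below uses scikit-learn on synthetic placeholder data. The feature matrix, CS-index targets, fold count, and SVR hyperparameters are assumptions for illustration only, not the MATRICS setup.

```python
# Minimal sketch of the nonsequential baseline: SVR predicting a
# communication-skill (CS) index from pooled multimodal features,
# scored with R^2 under 5-fold cross-validation.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))   # 60 participants x 8 pooled features (synthetic)
y = 0.8 * X[:, 0] - 0.5 * X[:, 3] + rng.normal(scale=0.1, size=60)  # CS index

# Standardize features before the RBF kernel, as is idiomatic for SVR.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"mean R^2 over 5 folds: {scores.mean():.3f}")
```

The same loop would apply unchanged to an LSTM-based regressor; only the estimator inside the pipeline changes.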
Pages: 27