Improved Multimodal Emotion Recognition for Better Game-Based Learning

Cited by: 4
Authors
Bahreini, Kiavash [1 ]
Nadolski, Rob [1 ]
Westera, Wim [1 ]
Affiliations
[1] Open Univ Netherlands, Fac Psychol & Educ Sci, Res Ctr Learning Teaching & Technol, Welten Inst, NL-6419 AT Heerlen, Netherlands
Source
GAMES AND LEARNING ALLIANCE, GALA 2014 | 2015, Vol. 9221
Keywords
Game-based learning; Human-computer interaction; Multimodal emotion recognition; Real-time emotion recognition; Affective computing; Webcam; Microphone; SERIOUS GAMES; EXPRESSIONS; AUDIO;
DOI
10.1007/978-3-319-22960-7_11
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces the integration of the face emotion recognition and voice emotion recognition components of our FILTWAM framework, which uses webcams and microphones. This framework enables real-time multimodal emotion recognition of learners during game-based learning in order to trigger feedback towards improved learning. The main goal of this study is to validate the integration of webcam and microphone data for a real-time and adequate interpretation of facial and vocal expressions into emotional states, with the software modules calibrated with end users. This integration aims to improve timely and relevant feedback, which is expected to increase learners' awareness of their own behavior. Twelve test persons received the same computer-based tasks, in which they were requested to mimic specific facial and vocal expressions. Each test person mimicked 80 emotions, which led to a dataset of 960 emotions. All sessions were recorded on video. The overall Kappa value, based on the requested emotions, expert opinions, and the recognized emotions, is 0.61; it is 0.76 for the face emotion recognition software and 0.58 for the voice emotion recognition software. Multimodal fusion of the two software modules can increase the accuracy to 78%. In contrast with existing software, our software modules allow real-time, continuous, and unobtrusive monitoring of learners' facial expressions and voice intonations and convert these into emotional states. The inclusion of learners' emotional states paves the way for more effective, efficient, and enjoyable game-based learning.
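For illustration, the sketch below shows one plausible decision-level fusion of per-class probabilities from a face module and a voice module, together with the Cohen's Kappa computation used to report agreement between requested and recognized emotions. The emotion set, the 0.6/0.4 weights, and the weighted-average fusion rule are assumptions made for the example, not details taken from the paper.

```python
# Illustrative sketch only, not the authors' implementation: the emotion set,
# the 0.6/0.4 weights, and the weighted-average (late) fusion rule are assumed.
from collections import Counter

EMOTIONS = ["happy", "sad", "angry", "surprised", "fearful", "disgusted", "neutral"]

def fuse(face_probs, voice_probs, w_face=0.6, w_voice=0.4):
    """Combine per-class probabilities from the face and voice modules."""
    fused = {e: w_face * face_probs[e] + w_voice * voice_probs[e] for e in EMOTIONS}
    return max(fused, key=fused.get)

def cohen_kappa(requested, recognized):
    """Cohen's kappa between requested and recognized emotion labels."""
    n = len(requested)
    p_o = sum(r == c for r, c in zip(requested, recognized)) / n
    req, rec = Counter(requested), Counter(recognized)
    p_e = sum(req[e] * rec[e] for e in EMOTIONS) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Example: one mimicked emotion, scored by both modules.
face = {"happy": 0.70, "sad": 0.05, "angry": 0.05, "surprised": 0.10,
        "fearful": 0.03, "disgusted": 0.02, "neutral": 0.05}
voice = {"happy": 0.40, "sad": 0.10, "angry": 0.10, "surprised": 0.20,
         "fearful": 0.05, "disgusted": 0.05, "neutral": 0.10}
print(fuse(face, voice))  # -> "happy"
print(cohen_kappa(["happy", "sad"], ["happy", "angry"]))  # -> 0.333...
```

Decision-level (late) fusion as sketched here is only one option; the fusion strategy actually used in the framework may differ, and any weights would need calibration with end users, as the abstract notes for the software modules.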
Pages: 107-120
Number of pages: 14