CorrNet: Fine-Grained Emotion Recognition for Video Watching Using Wearable Physiological Sensors

Cited by: 40
Authors
Zhang, Tianyi [1 ,2 ]
El Ali, Abdallah [2 ]
Wang, Chen [3 ,4 ]
Hanjalic, Alan [1 ]
Cesar, Pablo [1 ,2 ]
Affiliations
[1] Delft Univ Technol, Multimedia Comp Grp, NL-2600 AA Delft, Netherlands
[2] Ctr Wiskunde & Informat CWI, NL-1098XG Amsterdam, Netherlands
[3] Xinhuanet, Future Media & Convergence Inst, Beijing 100000, Peoples R China
[4] Xinhua News Agcy, State Key Lab Media Convergence Prod Technol & Sy, Beijing 100000, Peoples R China
Keywords
emotion recognition; video; physiological signals; machine learning; SYSTEM; TECHNOLOGY; FRAMEWORK; SIGNALS; CONTEXT; SET;
DOI
10.3390/s21010052
Chinese Library Classification: O65 [Analytical Chemistry]
Subject Classification Codes: 070302; 081704
Abstract
Recognizing user emotions while they watch short-form videos anytime and anywhere is essential for facilitating video content customization and personalization. However, most works either classify a single emotion per video stimulus, or are restricted to static, desktop environments. To address this, we propose a correlation-based emotion recognition algorithm (CorrNet) to recognize the valence and arousal (V-A) of each instance (fine-grained segment of signals) using only wearable, physiological signals (e.g., electrodermal activity, heart rate). CorrNet takes advantage of features both inside each instance (intra-modality features) and between different instances for the same video stimulus (correlation-based features). We first test our approach on an indoor-desktop affect dataset (CASE), and thereafter on an outdoor-mobile affect dataset (MERCA), which we collected using a smart wristband and a wearable eye tracker. Results show that for subject-independent binary classification (high-low), CorrNet yields promising recognition accuracies: 76.37% and 74.03% for V-A on CASE, and 70.29% and 68.15% for V-A on MERCA. Our findings show that: (1) instance segment lengths between 1 and 4 s result in the highest recognition accuracies; (2) accuracies obtained with laboratory-grade and wearable sensors are comparable, even at low sampling rates (<= 64 Hz); and (3) large amounts of neutral V-A labels, an artifact of continuous affect annotation, result in varied recognition performance.
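To make the instance-based pipeline more concrete, the sketch below shows, under stated assumptions, how a wearable physiological signal could be split into fixed-length instances and how correlations between instances of the same video stimulus could serve as features. This is a minimal NumPy illustration, not the authors' CorrNet implementation: the function names (segment_instances, correlation_features), the 2 s instance length, and the 64 Hz sampling rate are illustrative choices, and CorrNet itself learns such intra-modality and correlation-based features jointly rather than computing them by hand.

```python
# Minimal sketch (assumed, not the authors' code): segment a wearable signal
# into fixed-length instances and use inter-instance correlations as features.
import numpy as np

def segment_instances(signal, sampling_rate, instance_len_s=2.0):
    """Split a 1-D physiological signal (e.g., EDA or heart rate) into
    non-overlapping instances of instance_len_s seconds each."""
    step = int(instance_len_s * sampling_rate)
    n = len(signal) // step
    return np.stack([signal[i * step:(i + 1) * step] for i in range(n)])

def correlation_features(instances):
    """For each instance, use its Pearson correlations with every other
    instance of the same stimulus as a feature vector (a hand-crafted stand-in
    for CorrNet's learned correlation-based features)."""
    corr = np.corrcoef(instances)   # shape: (n_instances, n_instances)
    np.fill_diagonal(corr, 0.0)     # drop the trivial self-correlation
    return corr

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eda = rng.standard_normal(64 * 60)   # 60 s of synthetic EDA at 64 Hz
    inst = segment_instances(eda, sampling_rate=64, instance_len_s=2.0)
    feats = correlation_features(inst)
    print(inst.shape, feats.shape)       # (30, 128) (30, 30)
```

Each row of the resulting feature matrix could then be paired with the V-A label of its instance for binary (high-low) classification, mirroring the per-instance evaluation described in the abstract.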
Pages: 1-25 (25 pages)