One-Shot Only Real-Time Video Classification: A Case Study in Facial Emotion Recognition

Citations: 3
Authors
Basbrain, Arwa [1, 2]
Gan, John Q. [1]
Affiliations
[1] Univ Essex, Sch Comp Sci & Elect Engn, Colchester, Essex, England
[2] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah, Saudi Arabia
Source
INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2020, PT I | 2020 / Vol. 12489
Keywords
Video-based facial emotion recognition; Convolutional neural network; Spatial-temporal data fusion;
DOI
10.1007/978-3-030-62362-3_18
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video classification is an important research field due to its applications, ranging from human action recognition for video surveillance to emotion recognition for human-computer interaction. This paper proposes a new method called One-Shot Only (OSO) for real-time video classification with a case study in facial emotion recognition. Instead of using 3D convolutional neural networks (CNNs) or multiple 2D CNNs with decision fusion as in previous studies, the OSO method tackles video classification as a single image classification problem by spatially rearranging video frames using frame selection or clustering strategies to form a simple representative storyboard for spatio-temporal video information fusion. It uses a single 2D CNN for video classification and thus can be optimised end-to-end directly in terms of the classification accuracy. Experimental results show that the OSO method proposed in this paper outperformed multiple 2D CNNs with decision fusion by a large margin in terms of classification accuracy (by up to 13%) on the AFEW 7.0 dataset for video classification. It is also very fast, up to ten times faster than the commonly used 2D CNN architectures for video classification.
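The storyboard idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how frames are selected or laid out, so everything here is an assumption made for illustration: uniform temporal sampling stands in for the frame selection strategy, the storyboard is a row-major grid, and the names `select_frames` and `make_storyboard` are hypothetical. The resulting single image is what a 2D CNN classifier would then consume.

```python
import numpy as np

def select_frames(video, n_frames):
    """Pick n_frames spread evenly over the clip.

    Assumption: uniform sampling is used here only as a stand-in for the
    frame selection or clustering strategies mentioned in the abstract.
    """
    idx = np.linspace(0, len(video) - 1, num=n_frames).round().astype(int)
    return video[idx]

def make_storyboard(video, rows, cols):
    """Tile rows*cols selected frames into one image, row-major in time.

    Assumption: a simple grid layout; the paper's actual layout may differ.
    """
    frames = select_frames(video, rows * cols)      # (rows*cols, H, W, C)
    _, h, w, c = frames.shape
    grid = frames.reshape(rows, cols, h, w, c)
    # Reorder axes to (rows, H, cols, W, C) so frames in a grid row sit side by side.
    grid = grid.transpose(0, 2, 1, 3, 4)
    return grid.reshape(rows * h, cols * w, c)

# Toy clip: 20 frames of 4x4 RGB, each frame filled with its own index.
video = np.stack([np.full((4, 4, 3), t, dtype=np.uint8) for t in range(20)])
board = make_storyboard(video, rows=2, cols=2)
print(board.shape)  # one image holding four frames
```

Because the whole clip becomes one image, a single forward pass of one 2D network covers both spatial and temporal information, which is consistent with the speed advantage the abstract reports.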
Pages: 197-208
Page count: 12