Spatio-Temporal Encoder-Decoder Fully Convolutional Network for Video-Based Dimensional Emotion Recognition

Cited by: 20
Authors
Du, Zhengyin [1]
Wu, Suowei [2]
Huang, Di [1]
Li, Weixin [3]
Wang, Yunhong [3]
Affiliations
[1] Beihang Univ, Beijing Adv Innovat Ctr Big Data & Brain Comp, Sch Comp Sci & Engn, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China
[2] Beihang Univ, Beijing Adv Innovat Ctr Big Data & Brain Comp, Sino French Engineer Sch, Beijing 100191, Peoples R China
[3] Beihang Univ, Beijing Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Emotion recognition; Convolution; Decoding; Feature extraction; Videos; Visualization; Task analysis; Dimensional emotion recognition; Spatio-temporal fully convolutional network; Temporal hourglass CNN; Temporal intermediate supervision; Expression recognition
DOI
10.1109/TAFFC.2019.2940224
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video-based dimensional emotion recognition aims to map human affect into a dimensional emotion space based on visual signals, and it is a fundamental challenge in affective computing and human-computer interaction. In this paper, we present a novel encoder-decoder framework to tackle this problem. It adopts a fully convolutional design that cascades a 2D-convolution-based spatial encoder with a 1D-convolution-based temporal encoder-decoder for joint spatio-temporal modeling. In particular, to address the key issue of capturing discriminative long-term dynamic dependencies, our temporal model, referred to as the Temporal Hourglass Convolutional Neural Network (TH-CNN), extracts contextual relationships by integrating both low-level encoded and high-level decoded clues. Temporal Intermediate Supervision (TIS) is then introduced to enhance the affective representations generated by TH-CNN under a multi-resolution strategy, which guides TH-CNN to progressively learn macroscopic long-term trends and refined short-term fluctuations. Furthermore, thanks to TH-CNN and TIS, the knowledge learnt in the intermediate layers makes it possible to offer customized solutions for different applications by adjusting the decoder depth. Extensive experiments on three benchmark databases (RECOLA, SEWA and OMG) show superior results compared to state-of-the-art methods, indicating the effectiveness of the proposed approach.
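The temporal part of the design described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the per-frame features are assumed to come from a 2D convolutional spatial encoder and are replaced here by random numbers, the layer sizes, kernel width, additive skip connections, nearest-neighbour upsampling and the mean-squared-error loss are all assumptions chosen for brevity, and the names `conv1d`, `init_params`, `temporal_hourglass` and `multi_resolution_loss` are hypothetical. The sketch only shows the general shape of an hourglass over time (strided 1D convolutions down, upsampling plus skip connections up) with one prediction per decoder level, each compared against labels pooled to the same temporal resolution.

```python
import numpy as np

def conv1d(x, w, stride=1):
    """Zero-padded 1D convolution over time followed by ReLU.
    x: (T, C_in) sequence, w: (K, C_in, C_out) kernel with odd K."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.stack([
        np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1]))
        for t in range(0, x.shape[0], stride)
    ])
    return np.maximum(out, 0.0)

def init_params(channels, depth, kernel=3, seed=0):
    """Random weights (assumption: every level keeps the same channel count)."""
    rng = np.random.default_rng(seed)
    shape = (kernel, channels, channels)
    return {
        "enc": [rng.normal(0, 0.1, shape) for _ in range(depth)],
        "dec": [rng.normal(0, 0.1, shape) for _ in range(depth)],
        "head": [rng.normal(0, 0.1, (channels,)) for _ in range(depth)],
    }

def temporal_hourglass(feats, params):
    """Encode by halving the temporal length at each level, then decode by
    upsampling and adding the stored encoder features (skip connections).
    Returns one prediction sequence per decoder level, coarse to fine."""
    depth = len(params["enc"])
    skips, x = [], feats
    for d in range(depth):
        skips.append(x)                            # low-level encoded features
        x = conv1d(x, params["enc"][d], stride=2)  # temporal downsampling
    preds = []
    for d in reversed(range(depth)):
        x = np.repeat(x, 2, axis=0)                # nearest-neighbour upsampling
        x = conv1d(x + skips[d], params["dec"][d])
        preds.append(x @ params["head"][d])        # prediction at this resolution
    return preds

def multi_resolution_loss(preds, labels):
    """Sum of MSE terms, one per decoder level; labels are average-pooled
    to each prediction's length (MSE is a stand-in loss, an assumption)."""
    total = 0.0
    for p in preds:
        target = labels.reshape(len(p), -1).mean(axis=1)
        total += float(np.mean((p - target) ** 2))
    return total

# Toy input: 32 frames of 8-dimensional features (stand-in for the output of
# a 2D convolutional spatial encoder) and one label per frame.
rng = np.random.default_rng(1)
feats = rng.normal(size=(32, 8))
labels = rng.uniform(-1, 1, size=32)
params = init_params(channels=8, depth=3)
preds = temporal_hourglass(feats, params)
loss = multi_resolution_loss(preds, labels)
```

In this sketch the sequence length must be divisible by 2 raised to the depth. The first entries of `preds` are the coarse predictions, so stopping the decoder early yields a lower-resolution output, which is one way to read the abstract's remark about adjusting the decoder depth.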
Pages: 565-578
Number of pages: 14