Multimodal Fusion of Spatial-Temporal Features for Emotion Recognition in the Wild

Authors
Wang, Zuchen [1]
Fang, Yuchun [1]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China
Source
ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT I | 2018 / Vol. 10735
Funding
National Natural Science Foundation of China;
Keywords
Emotion recognition; Multimodal fusion; Spatial-temporal features;
DOI
10.1007/978-3-319-77380-3_20
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Making machines understand human emotion is a challenge in realizing artificial intelligence. Because temporal correlation is pervasive in video, we present a system that fuses multimodal spatial-temporal features to recognize emotion. For the visual modality, spatial-temporal features are extracted to represent how emotion varies dynamically with facial action in the video. The audio modality is used to assist the visual modality. A decision-level fusion approach is presented to exploit the complementarity between the visual and audio modalities and boost the performance of the emotion recognition system. Experiments on the challenging AFEW4.0 dataset show that the proposed system achieves better generalization performance than other state-of-the-art methods.
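The abstract does not say how the decision-level fusion is computed. As a rough illustration only, the sketch below shows one common form of decision-level (late) fusion: a weighted average of the per-class probabilities produced separately by a visual model and an audio model. The function names, the `visual_weight` value, and the weighted-average rule itself are assumptions made for this example and are not taken from the paper; the seven labels are the emotion categories used by the AFEW dataset.

```python
# Hypothetical sketch of decision-level (late) fusion.
# The weighted-average rule and the 0.7 default weight are assumptions,
# not the method described in the paper.

# The seven emotion categories used by the AFEW dataset.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprise"]


def fuse_decisions(visual_probs, audio_probs, visual_weight=0.7):
    """Return the weighted average of two per-class probability vectors."""
    if len(visual_probs) != len(audio_probs):
        raise ValueError("both modalities must score the same classes")
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")
    audio_weight = 1.0 - visual_weight
    return [
        visual_weight * v + audio_weight * a
        for v, a in zip(visual_probs, audio_probs)
    ]


def predict_emotion(visual_probs, audio_probs, visual_weight=0.7):
    """Fuse both modalities and return the label with the highest score."""
    fused = fuse_decisions(visual_probs, audio_probs, visual_weight)
    if len(fused) != len(EMOTIONS):
        raise ValueError("expected one score per emotion category")
    best_index = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best_index]


if __name__ == "__main__":
    # Made-up scores for a single clip, for illustration only.
    visual = [0.10, 0.05, 0.05, 0.40, 0.30, 0.05, 0.05]
    audio = [0.05, 0.05, 0.05, 0.20, 0.50, 0.05, 0.10]
    print(predict_emotion(visual, audio))
```

In this made-up clip the visual model alone favors "Happy", while the audio model strongly favors "Neutral"; after fusion the audio evidence tips the decision, which is the kind of complementarity the abstract refers to. In practice the weight would be tuned on a validation set rather than fixed by hand.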
Pages: 205 - 214
Page count: 10