Emotion recognition using hierarchical spatial-temporal learning transformer from regional to global brain

Cited by: 4
Authors
Cheng, Cheng [1 ]
Liu, Wenzhe [2 ]
Feng, Lin [1 ,3 ]
Jia, Ziyu [4 ]
Affiliations
[1] Dalian Univ Technol, Dept Comp Sci & Technol, Dalian, Peoples R China
[2] Huzhou Univ, Sch Informat Engn, Huzhou, Peoples R China
[3] Dalian Minzu Univ, Sch Informat & Commun Engn, Dalian, Peoples R China
[4] Univ Chinese Acad Sci, Chinese Acad Sci, Brainnetome Ctr, Inst Automat, Beijing, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Emotion recognition; Electroencephalogram (EEG); Transformer; Spatiotemporal features; EEG; FUSION;
DOI
10.1016/j.neunet.2024.106624
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion recognition is an essential but challenging task in human-computer interaction systems due to the distinctive spatial structures and dynamic temporal dependencies associated with each emotion. However, current approaches fail to accurately capture the intricate effects of electroencephalogram (EEG) signals across different brain regions on emotion recognition. Therefore, this paper designs a transformer-based method, denoted R2G-STLT, which relies on a spatial-temporal transformer encoder with regional-to-global hierarchical learning that learns representative spatiotemporal features from the electrode level to the brain-region level. The regional spatial-temporal transformer (RST-Trans) encoder is designed to obtain spatial information and context dependence at the electrode level, aiming to learn the regional spatiotemporal features. Then, the global spatial-temporal transformer (GST-Trans) encoder is utilized to extract reliable global spatiotemporal features, reflecting the impact of various brain regions on emotion recognition tasks. Moreover, a multi-head attention mechanism is placed into the GST-Trans encoder to empower it to capture the long-range spatial-temporal information among the brain regions. Finally, subject-independent experiments are conducted on each frequency band of the DEAP, SEED, and SEED-IV datasets to assess the performance of the proposed model. Results indicate that the R2G-STLT model surpasses several state-of-the-art approaches.
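The regional-to-global hierarchy described above can be illustrated with a minimal NumPy sketch: attention is first applied among the electrodes of each brain region, each region is pooled into a single token, and a second attention pass over the region tokens models cross-region dependencies. This is an illustrative simplification, not the authors' implementation: the region grouping, `self_attention` helper, and `r2g_forward` function are hypothetical, and learned projections, temporal encoding, multi-head attention, and positional embeddings from the paper are all omitted.

```python
import numpy as np

def self_attention(x):
    """Single-head scaled dot-product self-attention (learned weights omitted)."""
    d = x.shape[-1]
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(d)   # (batch, n, n) similarity
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)               # softmax over keys
    return w @ x

def r2g_forward(eeg, regions):
    """
    eeg:     (batch, n_electrodes, d) per-electrode feature vectors
    regions: list of electrode-index lists, one per brain region
    Returns one global feature vector per sample.
    """
    region_tokens = []
    for idx in regions:
        # Regional stage: attention among one region's electrodes,
        # then mean-pool into a single region token.
        out = self_attention(eeg[:, idx, :])
        region_tokens.append(out.mean(axis=1))
    regional = np.stack(region_tokens, axis=1)       # (batch, n_regions, d)
    # Global stage: attention over region tokens captures cross-region effects.
    global_out = self_attention(regional)
    return global_out.mean(axis=1)                   # (batch, d)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 6, 8))                   # 2 samples, 6 electrodes, 8-dim features
feat = r2g_forward(x, regions=[[0, 1, 2], [3, 4, 5]])
print(feat.shape)                                    # (2, 8)
```

The two-stage design mirrors the paper's motivation: electrodes within a region are more strongly correlated, so attending locally first reduces the sequence length the global encoder must handle and makes the cross-region interactions explicit.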
Pages: 12