Transformer-based fusion model for mild depression recognition with EEG and pupil area signals

Cited: 0
Authors
Zhu, Jing [1 ]
Li, Yuanlong [1 ]
Yang, Changlin [1 ]
Cai, Hanshu [1 ]
Li, Xiaowei [1 ]
Hu, Bin [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] Lanzhou Univ, Sch Informat Sci & Engn, Gansu Prov Key Lab Wearable Comp, Lanzhou 730000, Peoples R China
[2] Beijing Inst Technol, Sch Med Technol, Beijing, Peoples R China
[3] Chinese Acad Sci, CAS Ctr Excellence Brain Sci & Intelligence Technol, Shanghai Inst Biol Sci, Shanghai, Peoples R China
[4] Lanzhou Univ, Joint Res Ctr Cognit Neurosensor Technol, Lanzhou, Peoples R China
[5] Chinese Acad Sci, Inst Semicond, Lanzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Mild depression; EEG; Pupil area signal; Transformer; Attention;
DOI
10.1007/s11517-024-03269-8
Chinese Library Classification
TP39 [Computer Applications];
Discipline Classification Code
081203; 0835;
Abstract
Early detection and treatment are crucial for the prevention and management of depression, yet compared with major depression, mild depression has received far less research attention. Meanwhile, analysis of multimodal biosignals such as EEG, eye-movement data, and magnetic resonance imaging provides reliable technical means for the quantitative analysis of depression. However, effectively capturing relevant and complementary information across multimodal data to achieve efficient and accurate depression recognition remains a challenge. This paper proposes a novel Transformer-based fusion model that uses EEG and pupil area signals for mild depression recognition. We first introduce common spatial patterns (CSP) into the Transformer to construct single-modal models for the EEG and pupil data, and then use an attention bottleneck to construct a mid-fusion model that facilitates information exchange between the two modalities. This strategy enables the model to learn the most relevant and complementary information for each modality while sharing only the necessary information, which improves accuracy and reduces computational cost. Experimental results show that the single-modal EEG and pupil-area models achieve accuracies of 89.75% and 84.17%, precisions of 92.04% and 95.21%, recalls of 89.5% and 71%, specificities of 90% and 97.33%, and F1 scores of 89.41% and 78.44%, respectively, while the accuracy of the mid-fusion model reaches 93.25%. Our study demonstrates that the Transformer model can learn the long-term time-dependent relationship between EEG and pupil area signals, providing an idea for designing a reliable multimodal fusion model for mild depression recognition based on these signals.
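The attention-bottleneck idea described above can be sketched in a few lines: each modality's tokens attend over themselves plus a small set of shared bottleneck tokens, and only those bottleneck tokens are exchanged between streams. The following NumPy sketch is illustrative only, not the authors' implementation; the token counts, model dimension, identity projections, and the averaging update of the bottleneck are all simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, d):
    # Toy single-head self-attention with identity Q/K/V projections.
    scores = tokens @ tokens.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens

def bottleneck_fusion_layer(eeg, pupil, bottleneck):
    d = eeg.shape[1]
    nb = bottleneck.shape[0]
    # Each modality attends over its own tokens plus the shared bottleneck.
    eeg_out = self_attention(np.concatenate([eeg, bottleneck], axis=0), d)
    pup_out = self_attention(np.concatenate([pupil, bottleneck], axis=0), d)
    # Modality tokens are updated only from their own stream; the bottleneck
    # tokens are averaged across the two streams, so cross-modal information
    # flows exclusively through this narrow shared channel.
    new_bottleneck = 0.5 * (eeg_out[-nb:] + pup_out[-nb:])
    return eeg_out[:-nb], pup_out[:-nb], new_bottleneck

rng = np.random.default_rng(0)
eeg = rng.standard_normal((8, 16))         # 8 EEG tokens, model dim 16
pupil = rng.standard_normal((4, 16))       # 4 pupil-area tokens
bottleneck = rng.standard_normal((2, 16))  # 2 shared fusion tokens
for _ in range(2):                         # stack two fusion layers
    eeg, pupil, bottleneck = bottleneck_fusion_layer(eeg, pupil, bottleneck)
```

Because the two modalities never attend to each other directly, the attention cost stays close to that of two single-modal models, which is the computational saving the abstract refers to.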
Pages: 17
Related Papers
50 total
  • [11] DSTM: A transformer-based model with dynamic-static feature fusion in speech emotion recognition
    Jin, Guowei
    Xu, Yunfeng
    Kang, Hong
    Wang, Jialin
    Miao, Borui
    COMPUTER SPEECH AND LANGUAGE, 2025, 90
  • [12] Content-based multiple evidence fusion on EEG and eye movements for mild depression recognition
    Zhu, Jing
    Wei, Shiqing
    Yang, Changlin
    Xie, Xiannian
    Li, Yizhou
    Li, Xiaowei
    Hu, Bin
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2022, 226
  • [13] Transformer-Based Multimodal Spatial-Temporal Fusion for Gait Recognition
    Zhang, Jikai
    Ji, Mengyu
    He, Yihao
    Guo, Dongliang
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT XV, 2025, 15045 : 494 - 507
  • [14] Multimodal Emotion Recognition With Transformer-Based Self Supervised Feature Fusion
    Siriwardhana, Shamane
    Kaluarachchi, Tharindu
    Billinghurst, Mark
    Nanayakkara, Suranga
    IEEE ACCESS, 2020, 8 (08) : 176274 - 176285
  • [15] ViT-LLMR: Vision Transformer-based lower limb motion recognition from fusion signals of MMG and IMU
    Zhang, Hanyang
    Yang, Ke
    Cao, Gangsheng
    Xia, Chunming
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 82
  • [16] Robust Multimodal Emotion Recognition from Conversation with Transformer-Based Crossmodality Fusion
    Xie, Baijun
    Sidulova, Mariia
    Park, Chung Hyuk
    SENSORS, 2021, 21 (14)
  • [17] A transformer-based network for speech recognition
    Tang, L.
    International Journal of Speech Technology, 2023, 26 (02) : 531 - 539
  • [18] MTNet: Multimodal transformer network for mild depression detection through fusion of EEG and eye tracking
    Zhu, Feiyu
    Zhang, Jing
    Dang, Ruochen
    Hu, Bingliang
    Wang, Quan
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 100
  • [19] WFormer: A Transformer-Based Soft Fusion Model for Robust Image Watermarking
    Luo, Ting
    Wu, Jun
    He, Zhouyan
    Xu, Haiyong
    Jiang, Gangyi
    Chang, Chin-Chen
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, : 1 - 18
  • [20] Jump motion intention recognition and brain activity analysis based on EEG signals and Vision Transformer model
    Lu, Yanzheng
    Wang, Hong
    Niu, Jianye
    Lu, Zhiguo
    Liu, Chong
    Feng, Naishi
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 100