Transformer-based fusion model for mild depression recognition with EEG and pupil area signals

Times Cited: 0
Authors
Zhu, Jing [1 ]
Li, Yuanlong [1 ]
Yang, Changlin [1 ]
Cai, Hanshu [1 ]
Li, Xiaowei [1 ]
Hu, Bin [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] Lanzhou Univ, Sch Informat Sci & Engn, Gansu Prov Key Lab Wearable Comp, Lanzhou 730000, Peoples R China
[2] Beijing Inst Technol, Sch Med Technol, Beijing, Peoples R China
[3] Chinese Acad Sci, CAS Ctr Excellence Brain Sci & Intelligence Technol, Shanghai Inst Biol Sci, Shanghai 200031, Peoples R China
[4] Lanzhou Univ, Joint Res Ctr Cognit Neurosensor Technol, Lanzhou, Peoples R China
[5] Chinese Acad Sci, Inst Semicond, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Mild depression; EEG; Pupil area signal; Transformer; Attention;
DOI
10.1007/s11517-024-03269-8
Chinese Library Classification
TP39 [Computer Applications];
Discipline Codes
081203; 0835;
Abstract
Early detection and treatment are crucial for the prevention of depression, yet compared with major depression, mild depression has received far less research attention. Meanwhile, the analysis of multimodal biosignals such as EEG, eye movement data, and magnetic resonance imaging provides reliable technical means for the quantitative analysis of depression. However, effectively capturing the relevant and complementary information across multimodal data, so as to achieve efficient and accurate depression recognition, remains a challenge. This paper proposes a novel Transformer-based fusion model using EEG and pupil area signals for mild depression recognition. We first introduce common spatial patterns (CSP) into the Transformer to construct single-modal models for the EEG and pupil data, and then use attention bottlenecks to construct a mid-fusion model that mediates information exchange between the two modalities. This strategy lets the model learn the most relevant and complementary information for each modality while sharing only what is necessary, improving accuracy and reducing computational cost. Experimental results show that the single-modal EEG and pupil-area models achieve accuracies of 89.75% and 84.17%, precisions of 92.04% and 95.21%, recalls of 89.5% and 71%, specificities of 90% and 97.33%, and F1 scores of 89.41% and 78.44%, respectively, while the mid-fusion model reaches an accuracy of 93.25%. Our study demonstrates that the Transformer model can learn long-term temporal dependencies between EEG and pupil area signals, providing a basis for designing a reliable multimodal fusion model for mild depression recognition based on EEG and pupil area signals.
Pages: 17