EmMixformer: Mix Transformer for Eye Movement Recognition

Cited by: 0
Authors
Qin, Huafeng [1 ,2 ]
Zhu, Hongyu [1 ,2 ]
Jin, Xin [1 ,2 ]
Song, Qun [1 ,2 ]
El-Yacoubi, Mounim A. [3 ]
Gao, Xinbo [4 ]
Affiliations
[1] Chongqing Technol & Business Univ, Natl Res Base Intelligent Mfg Serv, Chongqing 400067, Peoples R China
[2] Chongqing Microvein Intelligent Technol Co, Chongqing 400053, Peoples R China
[3] Inst Polytech Paris, SAMOVAR, Telecom SudParis, Palaiseau 91120, France
[4] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
Keywords
Feature extraction; Transformers; Biometrics; Iris recognition; Long short term memory; Gaze tracking; Fourier transforms; Support vector machines; Data mining; Training; eye movements; Fourier transform; long short-term memory (LSTM); Transformer;
DOI
10.1109/TIM.2025.3551452
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Eye movement is a new, highly secure behavioral biometric modality that has received increasing attention in recent years. Although deep neural networks, such as convolutional neural networks (CNNs), have recently achieved promising performance (e.g., the highest recognition accuracy on the GazeBase database), current solutions fail to capture both local and global temporal dependencies within eye movement data. To overcome this problem, in this article we propose a mixed Transformer, termed EmMixformer, that extracts time- and frequency-domain information for eye movement recognition. To this end, we propose a mixed block consisting of three modules: a Transformer, an attention long short-term memory (LSTM), and a Fourier transformer. First, to our knowledge this is the first attempt to leverage Transformers to learn long temporal dependencies in eye movement. Second, we incorporate an attention mechanism into the LSTM to obtain an attention LSTM (attLSTM) that learns short temporal dependencies. Third, we perform self-attention in the frequency domain to learn global dependencies and capture the periodicity underlying the signal. As the three modules provide complementary feature representations of local and global dependencies, the proposed EmMixformer improves recognition accuracy. Experimental results on our eye movement dataset and two public eye movement datasets show that the proposed EmMixformer outperforms the state of the art (SOTA) by achieving the lowest verification error. The EMglasses database is available at https://github.com/HonyuZhu-s/CTBU-EMglasses-database.
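To make the architecture described in the abstract concrete, below is a minimal PyTorch sketch of a three-branch mixed block in the spirit of EmMixformer. It is an illustration under stated assumptions rather than the authors' implementation: the attention placement inside the LSTM (here, attention pooling over hidden states), the frequency branch (here, multi-head self-attention over real-FFT coefficients), the concatenation-plus-linear fusion, and every name (AttLSTM, FourierAttention, MixedBlock) and hyperparameter are hypothetical.

# Minimal sketch of a three-branch mixed block (Transformer + attLSTM + Fourier branch).
# The fusion rule, layer sizes, and module names are assumptions, not the paper's code.
import torch
import torch.nn as nn


class AttLSTM(nn.Module):
    """LSTM whose hidden states are re-weighted by a learned attention score
    (one plausible reading of the paper's attention LSTM)."""

    def __init__(self, dim: int):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                        # x: (batch, time, dim)
        h, _ = self.lstm(x)                      # hidden states: (batch, time, dim)
        w = torch.softmax(self.score(h), dim=1)  # attention weights over time
        return h * w                             # re-weighted sequence, same shape


class FourierAttention(nn.Module):
    """Self-attention applied to the real FFT of the sequence along time,
    then mapped back with the inverse FFT (a simplified frequency branch)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(2 * dim, heads, batch_first=True)
        self.proj = nn.Linear(2 * dim, 2 * dim)

    def forward(self, x):                        # x: (batch, time, dim)
        t = x.size(1)
        f = torch.fft.rfft(x, dim=1)             # complex spectrum: (batch, freq, dim)
        z = torch.cat([f.real, f.imag], dim=-1)  # real view: (batch, freq, 2*dim)
        z, _ = self.attn(z, z, z)                # self-attention over frequency bins
        z = self.proj(z)
        real, imag = z.chunk(2, dim=-1)
        return torch.fft.irfft(torch.complex(real, imag), n=t, dim=1)  # back to time


class MixedBlock(nn.Module):
    """Transformer + attLSTM + Fourier branch, fused by concatenation + linear."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.transformer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.att_lstm = AttLSTM(dim)
        self.fourier = FourierAttention(dim, heads)
        self.fuse = nn.Linear(3 * dim, dim)

    def forward(self, x):                        # x: (batch, time, dim)
        out = torch.cat(
            [self.transformer(x), self.att_lstm(x), self.fourier(x)], dim=-1
        )
        return self.fuse(out)


if __name__ == "__main__":
    seq = torch.randn(2, 128, 64)                # 2 recordings, 128 samples, 64 features
    print(MixedBlock(dim=64)(seq).shape)         # torch.Size([2, 128, 64])

As written, the block preserves the (batch, time, feature) shape, so the example prints torch.Size([2, 128, 64]) and the block could be stacked or followed by a recognition head.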
Pages: 14
Related Papers
50 records in total
  • [11] Vision Transformer With Relation Exploration for Pedestrian Attribute Recognition
    Tan, Hao
    Tan, Zichang
    Weng, Dunfang
    Liu, Ajian
    Wan, Jun
    Lei, Zhen
    Li, Stan Z.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 198 - 208
  • [12] Simplifying Multimodal Emotion Recognition with Single Eye Movement Modality
    Yan, Xu
    Zhao, Li-Ming
    Lu, Bao-Liang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021 : 1057 - 1063
  • [13] Pedestrian Navigation Activity Recognition Based on Segmentation Transformer
    Wang, Qu
    Tao, Zhi
    Ning, Jiahui
    Jiang, Zhuqing
    Guo, Liangliang
    Luo, Haiyong
    Wang, Haiying
    Men, Aidong
    Cheng, Xiaofei
    Zhang, Zhang
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (15) : 26020 - 26032
  • [14] A CNN-Transformer Hybrid Recognition Approach for sEMG-Based Dynamic Gesture Prediction
    Liu, Yanhong
    Li, Xingyu
    Yang, Lei
    Bian, Guibin
    Yu, Hongnian
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [15] EMS: A Large-Scale Eye Movement Dataset, Benchmark, and New Model for Schizophrenia Recognition
    Song, Yingjie
    Liu, Zhi
    Li, Gongyang
    Xie, Jiawei
    Wu, Qiang
    Zeng, Dan
    Xu, Lihua
    Zhang, Tianhong
    Wang, Jijun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [16] HPMG-Transformer: HP Filter Multi-Scale Gaussian Transformer for Liquor Stock Movement Prediction
    Huang, Lili
    IEEE ACCESS, 2024, 12 : 63885 - 63894
  • [17] Research on athlete’s wrong movement prediction method based on multimodal eye movement recognition
    Wang, L.
    INTERNATIONAL JOURNAL OF REASONING-BASED INTELLIGENT SYSTEMS, 2022, 14 (04) : 176 - 183
  • [18] Zoom Transformer for Skeleton-Based Group Activity Recognition
    Zhang, Jiaxu
    Jia, Yifan
    Xie, Wei
    Tu, Zhigang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (12) : 8646 - 8659
  • [19] Eye-movement strategies in developmental prosopagnosia and "super" face recognition
    Bobak, Anna K.
    Parris, Benjamin A.
    Gregory, Nicola J.
    Bennetts, Rachel J.
    Bate, Sarah
    QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 2017, 70 (02) : 201 - 217
  • [20] Hybrid CNN-Transformer Features for Visual Place Recognition
    Wang, Yuwei
    Qiu, Yuanying
    Cheng, Peitao
    Zhang, Junyu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (03) : 1109 - 1122