EmMixformer: Mix Transformer for Eye Movement Recognition

Cited by: 0
Authors
Qin, Huafeng [1 ,2 ]
Zhu, Hongyu [1 ,2 ]
Jin, Xin [1 ,2 ]
Song, Qun [1 ,2 ]
El-Yacoubi, Mounim A. [3 ]
Gao, Xinbo [4 ]
Affiliations
[1] Chongqing Technol & Business Univ, Natl Res Base Intelligent Mfg Serv, Chongqing 400067, Peoples R China
[2] Chongqing Microvein Intelligent Technol Co, Chongqing 400053, Peoples R China
[3] Inst Polytech Paris, SAMOVAR, Telecom SudParis, Palaiseau 91120, France
[4] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China
Keywords
Feature extraction; Transformers; Biometrics; Iris recognition; Long short term memory; Gaze tracking; Fourier transforms; Support vector machines; Data mining; Training; eye movements; Fourier transform; long short-term memory (LSTM); Transformer;
DOI
10.1109/TIM.2025.3551452
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Eye movement is a new, highly secure behavioral biometric modality that has received increasing attention in recent years. Although deep neural networks, such as convolutional neural networks (CNNs), have recently achieved promising performance (e.g., the highest recognition accuracy on the GazeBase database), current solutions fail to capture both local and global temporal dependencies within eye movement data. To overcome this problem, in this article we propose a mixed Transformer, termed EmMixformer, that extracts time- and frequency-domain information for eye movement recognition. To this end, we propose a mixed block consisting of three modules: a Transformer, an attention long short-term memory (LSTM), and a Fourier transformer. First, to our knowledge this is the first attempt to leverage Transformers to learn long temporal dependencies in eye movement. Second, we incorporate an attention mechanism into the LSTM to obtain an attention LSTM (attLSTM) that learns short temporal dependencies. Third, we perform self-attention in the frequency domain to learn global dependencies and capture the periodicity underlying the signal. As the three modules provide complementary feature representations of local and global dependencies, the proposed EmMixformer improves recognition accuracy. Experimental results on our eye movement dataset and two public eye movement datasets show that the proposed EmMixformer outperforms the state of the art (SOTA) by achieving the lowest verification error. The EMglasses database is available at https://github.com/HonyuZhu-s/CTBU-EMglasses-database.
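To make the architecture described in the abstract concrete, below is a minimal PyTorch sketch of a three-branch mixed block in the spirit of EmMixformer. It is an illustration under stated assumptions rather than the authors' implementation: the attention placement inside the LSTM (here, attention pooling over hidden states), the frequency branch (here, multi-head self-attention over real-FFT coefficients), the concatenation-plus-linear fusion, and every name (AttLSTM, FourierAttention, MixedBlock) and hyperparameter are hypothetical.

# Minimal sketch of a three-branch mixed block (Transformer + attLSTM + Fourier branch).
# The fusion rule, layer sizes, and module names are assumptions, not the paper's code.
import torch
import torch.nn as nn


class AttLSTM(nn.Module):
    """LSTM whose hidden states are re-weighted by a learned attention score
    (one plausible reading of the paper's attention LSTM)."""

    def __init__(self, dim: int):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                        # x: (batch, time, dim)
        h, _ = self.lstm(x)                      # hidden states: (batch, time, dim)
        w = torch.softmax(self.score(h), dim=1)  # attention weights over time
        return h * w                             # re-weighted sequence, same shape


class FourierAttention(nn.Module):
    """Self-attention applied to the real FFT of the sequence along time,
    then mapped back with the inverse FFT (a simplified frequency branch)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(2 * dim, heads, batch_first=True)
        self.proj = nn.Linear(2 * dim, 2 * dim)

    def forward(self, x):                        # x: (batch, time, dim)
        t = x.size(1)
        f = torch.fft.rfft(x, dim=1)             # complex spectrum: (batch, freq, dim)
        z = torch.cat([f.real, f.imag], dim=-1)  # real view: (batch, freq, 2*dim)
        z, _ = self.attn(z, z, z)                # self-attention over frequency bins
        z = self.proj(z)
        real, imag = z.chunk(2, dim=-1)
        return torch.fft.irfft(torch.complex(real, imag), n=t, dim=1)  # back to time


class MixedBlock(nn.Module):
    """Transformer + attLSTM + Fourier branch, fused by concatenation + linear."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.transformer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.att_lstm = AttLSTM(dim)
        self.fourier = FourierAttention(dim, heads)
        self.fuse = nn.Linear(3 * dim, dim)

    def forward(self, x):                        # x: (batch, time, dim)
        out = torch.cat(
            [self.transformer(x), self.att_lstm(x), self.fourier(x)], dim=-1
        )
        return self.fuse(out)


if __name__ == "__main__":
    seq = torch.randn(2, 128, 64)                # 2 recordings, 128 samples, 64 features
    print(MixedBlock(dim=64)(seq).shape)         # torch.Size([2, 128, 64])

As written, the block preserves the (batch, time, feature) shape, so the example prints torch.Size([2, 128, 64]) and the block could be stacked or followed by a recognition head.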
Pages: 14
Related Papers
50 records in total
  • [11] Vision Transformer With Relation Exploration for Pedestrian Attribute Recognition
    Tan, Hao
    Tan, Zichang
    Weng, Dunfang
    Liu, Ajian
    Wan, Jun
    Lei, Zhen
    Li, Stan Z.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 198 - 208
  • [12] Simplifying Multimodal Emotion Recognition with Single Eye Movement Modality
    Yan, Xu
    Zhao, Li-Ming
    Lu, Bao-Liang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021 : 1057 - 1063
  • [13] Pedestrian Navigation Activity Recognition Based on Segmentation Transformer
    Wang, Qu
    Tao, Zhi
    Ning, Jiahui
    Jiang, Zhuqing
    Guo, Liangliang
    Luo, Haiyong
    Wang, Haiying
    Men, Aidong
    Cheng, Xiaofei
    Zhang, Zhang
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (15) : 26020 - 26032
  • [14] A CNN-Transformer Hybrid Recognition Approach for sEMG-Based Dynamic Gesture Prediction
    Liu, Yanhong
    Li, Xingyu
    Yang, Lei
    Bian, Guibin
    Yu, Hongnian
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [15] EMS: A Large-Scale Eye Movement Dataset, Benchmark, and New Model for Schizophrenia Recognition
    Song, Yingjie
    Liu, Zhi
    Li, Gongyang
    Xie, Jiawei
    Wu, Qiang
    Zeng, Dan
    Xu, Lihua
    Zhang, Tianhong
    Wang, Jijun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [16] HPMG-Transformer: HP Filter Multi-Scale Gaussian Transformer for Liquor Stock Movement Prediction
    Huang, Lili
    IEEE ACCESS, 2024, 12 : 63885 - 63894
  • [17] Research on athlete’s wrong movement prediction method based on multimodal eye movement recognition
    Wang, L.
    INTERNATIONAL JOURNAL OF REASONING-BASED INTELLIGENT SYSTEMS, 2022, 14 (04) : 176 - 183
  • [18] Zoom Transformer for Skeleton-Based Group Activity Recognition
    Zhang, Jiaxu
    Jia, Yifan
    Xie, Wei
    Tu, Zhigang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (12) : 8646 - 8659
  • [19] Eye-movement strategies in developmental prosopagnosia and "super" face recognition
    Bobak, Anna K.
    Parris, Benjamin A.
    Gregory, Nicola J.
    Bennetts, Rachel J.
    Bate, Sarah
    QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 2017, 70 (02) : 201 - 217
  • [20] Hybrid CNN-Transformer Features for Visual Place Recognition
    Wang, Yuwei
    Qiu, Yuanying
    Cheng, Peitao
    Zhang, Junyu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (03) : 1109 - 1122