TransGait: Multimodal-based gait recognition with set transformer

Citations: 30
Authors
Li, Guodong [1]
Guo, Lijun [1]
Zhang, Rong [1]
Qian, Jiangbo [1]
Gao, Shangce [2]
Affiliations
[1] NingBo Univ, Fac Elect Engn & Comp Sci, Ningbo, Peoples R China
[2] Univ Toyama, Toyama, Japan
Keywords
Gait recognition; Multi-modal; Transformer
DOI
10.1007/s10489-022-03543-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As a biometric feature that can be recognized from a distance, gait has a wide range of applications, such as crime prevention, judicial identification, and social security. However, gait recognition remains a challenging task, and typical methods suffer from two problems. First, existing gait recognition methods are not robust to pedestrians' clothing and carried items. Second, existing temporal modeling methods for gait recognition fail to fully exploit the temporal relationships within a sequence and impose unnecessary ordering constraints on the gait sequence. In this paper, we propose a new multi-modal gait recognition framework based on silhouette and pose features to overcome these problems. Joint silhouette and pose features provide high discriminability and robustness to pedestrians' clothing and carried items. Furthermore, we propose a set transformer model with a temporal aggregation operation for obtaining set-level spatio-temporal features. This temporal modeling approach is unaffected by frame permutations and can seamlessly integrate frames from different videos acquired in different scenarios, such as diverse viewing angles. Experiments on two public datasets, CASIA-B and GREW, demonstrate that the proposed method provides state-of-the-art performance. Under the most challenging condition on CASIA-B, walking in different clothes, the proposed method achieves a rank-1 accuracy of 85.8%, outperforming other methods by a significant margin (> 4%).
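The permutation property claimed in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: it assumes a single attention head with identity query/key/value projections, no positional encoding, and max pooling over frames as the temporal aggregation. All function names here are hypothetical. Self-attention without positional encoding only reorders its outputs when the input frames are reordered, so a pooling step over the frame axis yields the same set-level feature for any frame order.

```python
import math
import random


def self_attention(frames):
    """Single-head dot-product self-attention with no positional encoding.

    frames: list of per-frame feature vectors (lists of floats).
    Assumption: identity projections stand in for learned Q/K/V weights.
    """
    dim = len(frames[0])
    attended = []
    for query in frames:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in frames
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        attended.append([
            sum(w * value[d] for w, value in zip(weights, frames)) / total
            for d in range(dim)
        ])
    return attended


def set_level_feature(frames):
    """Attend across frames, then max-pool over the frame axis."""
    attended = self_attention(frames)
    return [max(f[d] for f in attended) for d in range(len(frames[0]))]


rng = random.Random(0)
frames = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(6)]
shuffled = frames[:]
rng.shuffle(shuffled)

original = set_level_feature(frames)
permuted = set_level_feature(shuffled)

# The two features agree up to floating-point summation order.
max_gap = max(abs(a - b) for a, b in zip(original, permuted))
print(max_gap < 1e-9)
```

Because no step depends on frame position, frames drawn from several videos could be concatenated into one list and aggregated the same way, which is the property the abstract describes for mixing viewing angles.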
Pages: 1535-1547
Number of pages: 13
Related papers
50 in total
  • [31] Gait recognition using free-area transformer networks
    Chen, Guannan
    Wei, Shimin
    MACHINE VISION AND APPLICATIONS, 2023, 34 (06)
  • [33] Research on Multimodal Sentiment Classification of Internet Memes Based on Transformer
    Chi, Shengnan
    Sang, Guoming
    Shi, Xian
    PROCEEDINGS OF 2024 3RD INTERNATIONAL CONFERENCE ON CRYPTOGRAPHY, NETWORK SECURITY AND COMMUNICATION TECHNOLOGY, CNSCT 2024, 2024, : 445 - 450
  • [34] Learning Mutual Correlation in Multimodal Transformer for Speech Emotion Recognition
    Wang, Yuhua
    Shen, Guang
    Xu, Yuezhu
    Li, Jiahang
    Zhao, Zhengdao
    INTERSPEECH 2021, 2021, : 4518 - 4522
  • [35] Multimodal Locally Enhanced Transformer for Continuous Sign Language Recognition
    Papadimitriou, Katerina
    Potamianos, Gerasimos
    INTERSPEECH 2023, 2023, : 1513 - 1517
  • [36] Multimodal Interaction Fusion Network Based on Transformer for Video Captioning
    Xu, Hui
    Zeng, Pengpeng
    Khan, Abdullah Aman
    ARTIFICIAL INTELLIGENCE AND ROBOTICS, ISAIR 2022, PT I, 2022, 1700 : 21 - 36
  • [37] TNTC: TWO-STREAM NETWORK WITH TRANSFORMER-BASED COMPLEMENTARITY FOR GAIT-BASED EMOTION RECOGNITION
    Hu, Chuanfei
    Sheng, Weijie
    Dong, Bo
    Li, Xinde
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3229 - 3233
  • [38] GaitSet: Cross-View Gait Recognition Through Utilizing Gait As a Deep Set
    Chao, Hanqing
    Wang, Kun
    He, Yiwei
    Zhang, Junping
    Feng, Jianfeng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (07) : 3467 - 3478
  • [39] LLE based gait recognition
    Li, HG
    Shi, CP
    Li, XG
    PROCEEDINGS OF 2005 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-9, 2005, : 4516 - 4521
  • [40] Context based gait recognition
    Bazazian, Shermin
    Gavrilova, Marina
    MULTISENSOR, MULTISOURCE INFORMATION FUSION: ARCHITECTURES, ALGORITHMS, AND APPLICATIONS 2012, 2012, 8407