Transformer for Object Re-identification: A Survey

Cited by: 1
Authors
Ye, Mang [1]
Chen, Shuoyi [1]
Li, Chenyue [1]
Zheng, Wei-Shi [2]
Crandall, David [3]
Du, Bo [1]
Affiliations
[1] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Hubei Luojia Lab, Wuhan, Peoples R China
[2] Sun Yat-sen Univ, Sch Data & Comp Sci, Guangzhou, Peoples R China
[3] Indiana Univ, Luddy Sch Informat Comp & Engn, Bloomington, IN USA
Funding
National Natural Science Foundation of China
Keywords
Object Re-Identification; Transformer; Survey; Person Re-Identification; Deep Learning; PERSON REIDENTIFICATION; VEHICLE REIDENTIFICATION; ALIGNMENT; VISION;
DOI
10.1007/s11263-024-02284-4
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object Re-identification (Re-ID) aims to identify specific objects across different times and scenes and is a widely researched task in computer vision. For a prolonged period, this field has been predominantly driven by deep learning technology based on convolutional neural networks. In recent years, the emergence of Vision Transformers has spurred a growing number of studies delving deeper into Transformer-based Re-ID, which continue to break performance records and have brought significant progress to the Re-ID field. Offering a powerful, flexible, and unified solution, Transformers cater to a wide array of Re-ID tasks with unparalleled efficacy. This paper provides a comprehensive review and in-depth analysis of Transformer-based Re-ID. By categorizing existing works into Image/Video-Based Re-ID, Re-ID with limited data/annotations, Cross-Modal Re-ID, and Special Re-ID Scenarios, we thoroughly elucidate the advantages demonstrated by the Transformer in addressing a multitude of challenges across these domains. Considering the trending unsupervised Re-ID, we propose a new Transformer baseline, UntransReID, achieving state-of-the-art performance on both single- and cross-modal tasks. For the under-explored animal Re-ID, we devise a standardized experimental benchmark and conduct extensive experiments to explore the applicability of the Transformer for this task and facilitate future research. Finally, we discuss some important yet under-investigated open issues in the large foundation model era. We believe this survey will serve as a new handbook for researchers in this field. A periodically updated website will be available at https://github.com/mangye16/ReID-Survey.
Pages: 2410-2440
Page count: 31