Transformer for Object Re-identification: A Survey

Cited by: 1
Authors
Ye, Mang [1]
Chen, Shuoyi [1]
Li, Chenyue [1]
Zheng, Wei-Shi [2]
Crandall, David [3]
Du, Bo [1]
Affiliations
[1] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Hubei Luojia Lab, Wuhan, Peoples R China
[2] Sun Yat-sen Univ, Sch Data & Comp Sci, Guangzhou, Peoples R China
[3] Indiana Univ, Luddy Sch Informat Comp & Engn, Bloomington, IN USA
Funding
National Natural Science Foundation of China
Keywords
Object Re-Identification; Transformer; Survey; Person Re-Identification; Deep Learning; PERSON REIDENTIFICATION; VEHICLE REIDENTIFICATION; ALIGNMENT; VISION;
DOI
10.1007/s11263-024-02284-4
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Object Re-identification (Re-ID) aims to identify specific objects across different times and scenes and is a widely researched task in computer vision. For a long time, the field has been driven predominantly by deep learning based on convolutional neural networks. In recent years, the emergence of Vision Transformers has spurred a growing number of studies on Transformer-based Re-ID, which have repeatedly broken performance records and brought significant progress to the field. Offering a powerful, flexible, and unified solution, Transformers serve a wide array of Re-ID tasks effectively. This paper provides a comprehensive review and in-depth analysis of Transformer-based Re-ID. By categorizing existing works into Image/Video-Based Re-ID, Re-ID with limited data/annotations, Cross-Modal Re-ID, and Special Re-ID Scenarios, we elucidate the advantages the Transformer demonstrates in addressing the challenges across these domains. Considering the trend toward unsupervised Re-ID, we propose a new Transformer baseline, UntransReID, which achieves state-of-the-art performance on both single-modal and cross-modal tasks. For the under-explored animal Re-ID, we devise a standardized experimental benchmark and conduct extensive experiments to explore the applicability of Transformers to this task and to facilitate future research. Finally, we discuss some important yet under-investigated open issues in the era of large foundation models. We believe this survey will serve as a new handbook for researchers in this field. A periodically updated website will be available at https://github.com/mangye16/ReID-Survey.
Pages: 2410-2440
Page count: 31