Transformer for Object Re-identification: A Survey

Cited by: 1

Authors:

Ye, Mang [1]

Chen, Shuoyi [1]

Li, Chenyue [1]

Zheng, Wei-Shi [2]

Crandall, David [3]

Du, Bo [1]

Affiliations:

[1] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Hubei Luojia Lab, Wuhan, Peoples R China

[2] Sun Yat sen Univ, Sch Data & Comp Sci, Guangzhou, Peoples R China

[3] Indiana Univ, Luddy Sch Informat Comp & Engn, Bloomington, IN USA

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2025 / Vol. 133 / Issue 05

Funding:

National Natural Science Foundation of China;

Keywords:

Object Re-Identification; Transformer; Survey; Person Re-Identification; Deep Learning; PERSON REIDENTIFICATION; VEHICLE REIDENTIFICATION; ALIGNMENT; VISION;

DOI:

10.1007/s11263-024-02284-4

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Object Re-identification (Re-ID), a widely researched task in computer vision, aims to identify specific objects across different times and scenes. For a long time, this field has been driven predominantly by deep learning based on convolutional neural networks. In recent years, the emergence of Vision Transformers has spurred a growing number of studies on Transformer-based Re-ID, which continue to break performance records and have brought significant progress to the field. Offering a powerful, flexible, and unified solution, Transformers cater to a wide array of Re-ID tasks with unparalleled efficacy. This paper provides a comprehensive review and in-depth analysis of Transformer-based Re-ID. Categorizing existing works into Image/Video-Based Re-ID, Re-ID with limited data/annotations, Cross-Modal Re-ID, and Special Re-ID Scenarios, we thoroughly elucidate the advantages that Transformers demonstrate in addressing the challenges across these domains. Considering the trending unsupervised Re-ID, we propose a new Transformer baseline, UntransReID, that achieves state-of-the-art performance on both single-modal and cross-modal tasks. For the under-explored animal Re-ID, we devise a standardized experimental benchmark and conduct extensive experiments to explore the applicability of Transformers to this task and to facilitate future research. Finally, we discuss some important yet under-investigated open issues in the large foundation model era. We believe this survey will serve as a new handbook for researchers in this field. A periodically updated website will be available at https://github.com/mangye16/ReID-Survey.

Pages: 2410-2440

Page count: 31