Prototype-Guided Attention Distillation for Discriminative Person Search

Cited by: 0

Authors:

Kim, Hanjae [1]

Lee, Jiyoung [2]

Sohn, Kwanghoon [1,3]

Affiliations:

[1] Yonsei Univ, Sch Elect & Elect Engn, Seoul 03722, South Korea

[2] NAVER AI Lab, Seongnam 13561, South Korea

[3] Korea Inst Sci & Technol KIST, Seoul 02792, South Korea

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2025 / Vol. 47 / Issue 01

Funding:

National Research Foundation of Singapore;

Keywords:

Prototypes; Transformers; Proposals; Detectors; Training; Noise; Head; Person search; person re-identification; attention distillation; NETWORK;

DOI:

10.1109/TPAMI.2024.3461778

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Person search aims to localize a person of interest in a large image gallery captured by multiple, non-overlapping cameras. Prevalent unified methods have suffered from (1) noisy proposals with mis-detection and occlusion, and (2) large appearance variation within a class, which deteriorates the prototype-based metric learning. To address these problems, we introduce Prototype-guided Attention Distillation, PAD for short, which exploits a prototype (a typical representation of an identity) as guidance for the attention module to consistently highlight identity-inherent regions across different poses. To utilize the knowledge encoded in prototypes for matching unseen IDs, PAD conducts attention distillation to guide student Re-ID queries by deeply mimicking attention maps from the prototype query. Additionally, to address large intra-class variation induced by pose or camera views, we extend PAD with multiple part prototypes representing consistent local regions across different instances. Furthermore, we exploit an adaptive momentum strategy for robust attention distillation in PAD to update more distinct prototypes. Extensive experiments conducted on CUHK-SYSU and PRW demonstrate the effectiveness of PAD, showcasing state-of-the-art performance. Moreover, our distilled attention surprisingly highlights distinguished multiple regions for person search.

Citation

Pages: 99-115

Page count: 17

References: 121 in total

