Prototype-Guided Attention Distillation for Discriminative Person Search

Citations: 0
Authors
Kim, Hanjae [1]
Lee, Jiyoung [2]
Sohn, Kwanghoon [1, 3]
Affiliations
[1] Yonsei Univ, Sch Elect & Elect Engn, Seoul 03722, South Korea
[2] NAVER AI Lab, Seongnam 13561, South Korea
[3] Korea Inst Sci & Technol KIST, Seoul 02792, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Prototypes; Transformers; Proposals; Detectors; Training; Noise; Head; Person search; person re-identification; attention distillation; NETWORK;
DOI
10.1109/TPAMI.2024.3461778
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Person search aims to localize a person of interest in a large image gallery captured by multiple non-overlapping cameras. Prevalent unified methods suffer from (1) noisy proposals caused by mis-detection and occlusion, and (2) large appearance variation within a class, both of which degrade prototype-based metric learning. To address these problems, we introduce Prototype-guided Attention Distillation (PAD), which uses a prototype (a typical representation of an identity) as guidance for the attention module so that it consistently highlights identity-inherent regions across different poses. To apply the knowledge encoded in prototypes when matching unseen IDs, PAD performs attention distillation, guiding student Re-ID queries to closely mimic the attention maps of the prototype query. Additionally, to address large intra-class variation induced by pose or camera view, we extend PAD with multiple part prototypes that represent consistent local regions across different instances. Furthermore, we adopt an adaptive momentum strategy in PAD that updates prototypes to be more distinct, making attention distillation more robust. Extensive experiments on CUHK-SYSU and PRW demonstrate the effectiveness of PAD, which achieves state-of-the-art performance. Moreover, the distilled attention highlights multiple distinctive regions for person search.
Pages: 99 - 115
Number of pages: 17