NFormer: Robust Person Re-identification with Neighbor Transformer

Cited by: 122
Authors
Wang, Haochen [1]
Shen, Jiayi [1]
Liu, Yongtuo [1]
Gao, Yan [2]
Gavves, Efstratios [1]
Affiliations
[1] University of Amsterdam, Amsterdam, Netherlands
[2] Xiaohongshu Inc, Shanghai, People's Republic of China
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2022
Keywords
NETWORK;
DOI
10.1109/CVPR52688.2022.00715
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Person re-identification aims to retrieve persons in highly varying settings across different cameras and scenarios, in which robust and discriminative representation learning is crucial. Most research considers learning representations from single images, ignoring any potential interactions between them. However, due to the high intra-identity variations, ignoring such interactions typically leads to outlier features. To tackle this issue, we propose a Neighbor Transformer Network, or NFormer, which explicitly models interactions across all input images, thus suppressing outlier features and leading to more robust representations overall. As modelling interactions between an enormous number of images is a massive task with many distractors, NFormer introduces two novel modules: the Landmark Agent Attention and the Reciprocal Neighbor Softmax. Specifically, the Landmark Agent Attention efficiently models the relation map between images by a low-rank factorization with a few landmarks in feature space. Moreover, the Reciprocal Neighbor Softmax achieves sparse attention to relevant neighbors only, rather than to all neighbors, which alleviates interference from irrelevant representations and further relieves the computational burden. In experiments on four large-scale datasets, NFormer achieves a new state-of-the-art. The code is released at https://github.com/haochenheheda/NFormer.
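The two modules named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the released repository for that); it is a simplified illustration under stated assumptions: landmarks are sampled at random from the input features, features are L2-normalized, no learned projections are used, and the function names (`landmark_affinity`, `reciprocal_neighbor_mask`, `masked_softmax`, `aggregate`) and the parameters `num_landmarks` and `k` are hypothetical.

```python
import numpy as np


def landmark_affinity(x, num_landmarks, rng):
    """Low-rank approximation of the N x N relation map via a few landmarks.

    Assumption: landmarks are random samples of the (normalized) inputs.
    """
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    idx = rng.choice(x.shape[0], size=num_landmarks, replace=False)
    landmarks = x[idx]                       # (l, d)
    agents = x @ landmarks.T                 # (N, l): similarity to each landmark
    # Product of two thin matrices, so the result has rank at most l.
    return agents @ agents.T / np.sqrt(num_landmarks)


def reciprocal_neighbor_mask(affinity, k):
    """Keep pair (i, j) only if j is in i's top-k AND i is in j's top-k."""
    n = affinity.shape[0]
    topk = np.argsort(-affinity, axis=1)[:, :k]
    is_neighbor = np.zeros((n, n), dtype=bool)
    is_neighbor[np.arange(n)[:, None], topk] = True
    mask = is_neighbor & is_neighbor.T
    np.fill_diagonal(mask, True)             # assumption: every image attends to itself
    return mask


def masked_softmax(affinity, mask):
    """Row-wise softmax restricted to the entries selected by the mask."""
    scores = np.where(mask, affinity, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)                 # masked-out entries become exactly 0
    return weights / weights.sum(axis=1, keepdims=True)


def aggregate(x, num_landmarks=4, k=3, seed=0):
    """Refine each feature as a sparse weighted sum over its reciprocal neighbors."""
    rng = np.random.default_rng(seed)
    affinity = landmark_affinity(x, num_landmarks, rng)
    mask = reciprocal_neighbor_mask(affinity, k)
    weights = masked_softmax(affinity, mask)
    return weights @ x, weights


if __name__ == "__main__":
    features = np.random.default_rng(1).normal(size=(10, 8))
    refined, weights = aggregate(features)
    print(refined.shape, (weights > 0).sum(axis=1))
```

In this sketch the affinity map is never more than rank `num_landmarks`, and each row of the attention matrix has at most `k + 1` nonzero weights, which mirrors the low-rank and sparsity properties the abstract describes.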
Pages: 7287-7297
Page count: 11