Auxiliary Representation Guided Network for Visible-Infrared Person Re-Identification

Cited by: 0
Authors
Qi, Mengzan [1]
Chan, Sixian [2]
Hang, Chen [1]
Zhang, Guixu [1]
Zeng, Tieyong [3]
Li, Zhi [1]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China
[2] Zhejiang Univ Technol, Sch Comp Sci & Technol, Hangzhou 310027, Peoples R China
[3] Chinese Univ Hong Kong, Dept Math, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Identification of persons; Learning systems; Cameras; Representation learning; Hands; Visualization; Transformers; Semantics; Robustness; Visible-infrared person re-identification; cross-modality discrepancy; auxiliary representation;
DOI
10.1109/TMM.2024.3521773
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Visible-Infrared Person Re-identification (VI-ReID) aims to retrieve images of specific identities across modalities. To relieve the large cross-modality discrepancy, prior work introduces an auxiliary modality within the image space to assist modality-invariant representation learning. However, the quality of the generated auxiliary images remains difficult to constrain, which creates a bottleneck in retrieval performance. In this paper, we propose a novel Auxiliary Representation Guided Network (ARGN) to explore the potential of auxiliary representations, which are generated directly within the modality-shared embedding space. In contrast to the original visible and infrared representations, which contain information solely from their respective modalities, these auxiliary representations integrate cross-modality information by fusing both modalities. In our framework, we use these auxiliary representations as modality guidance to reduce the cross-modality discrepancy. First, we propose a High-quality Auxiliary Representation Learning (HARL) framework to generate identity-consistent auxiliary representations. The primary objective of HARL is to ensure that auxiliary representations capture diverse information from both modalities while preserving identity-related discrimination. Second, guided by these auxiliary representations, we design an Auxiliary Representation Guided Constraint (ARGC) to optimize the modality-shared embedding space. With this constraint, the modality-shared embedding space achieves enhanced intra-identity compactness and inter-identity separability, further improving retrieval performance. In addition, to improve the robustness of our framework against modality variation, we introduce a Part-based Adaptive Gaussian Module (PAGM) to adaptively extract discriminative information across modalities. Finally, extensive experiments demonstrate the superiority of our method over state-of-the-art approaches on three VI-ReID datasets.
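The abstract does not give formulas for HARL or ARGC, so the following is only a minimal sketch of the general idea it describes: fuse visible and infrared embeddings of the same identity into an auxiliary representation, then use that representation as an anchor that pulls same-identity embeddings together and pushes other identities away. The averaging fusion, the hinge-style loss, the `margin` parameter, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt


def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def auxiliary_representations(visible, infrared):
    """Hypothetical fusion: for each identity, average the mean visible
    embedding and the mean infrared embedding (assumption, not the paper's HARL)."""
    return {
        pid: mean_vec([mean_vec(visible[pid]), mean_vec(infrared[pid])])
        for pid in visible
        if pid in infrared
    }


def guided_loss(visible, infrared, aux, margin=0.5):
    """Hypothetical hinge loss (assumption, not the paper's ARGC): each embedding
    should be closer to its own identity's auxiliary representation than to the
    nearest other identity's, by at least `margin`. Needs at least two identities."""
    total, count = 0.0, 0
    for modality in (visible, infrared):
        for pid, embeddings in modality.items():
            for emb in embeddings:
                d_own = dist(emb, aux[pid])
                d_other = min(dist(emb, c) for q, c in aux.items() if q != pid)
                total += max(0.0, d_own - d_other + margin)
                count += 1
    return total / count


# Toy 2-D embeddings for two identities, one sample per modality.
visible = {"a": [[1.0, 0.0]], "b": [[0.0, 1.0]]}
infrared = {"a": [[0.8, 0.2]], "b": [[0.2, 0.8]]}
aux = auxiliary_representations(visible, infrared)
loss_small_margin = guided_loss(visible, infrared, aux, margin=0.5)
loss_large_margin = guided_loss(visible, infrared, aux, margin=2.0)
```

In this toy setup the identities are already well separated, so the loss vanishes for a small margin and becomes positive only when the margin is larger than the gap between identities.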
Pages: 340-355
Number of pages: 16