Auxiliary Representation Guided Network for Visible-Infrared Person Re-Identification

Cited by: 0

Authors:

Qi, Mengzan [1]

Chan, Sixian [2]

Hang, Chen [1]

Zhang, Guixu [1]

Zeng, Tieyong [3]

Li, Zhi [1]

Affiliations:

[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China

[2] ZheJiang Univ Technol, Sch Comp Sci & Technol, Hangzhou 310027, Peoples R China

[3] Chinese Univ Hong Kong, Dept Math, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2025 / Vol. 27

Funding:

National Natural Science Foundation of China;

Keywords:

Feature extraction; Identification of persons; Learning systems; Cameras; Representation learning; Hands; Visualization; Transformers; Semantics; Robustness; Visible-infrared person re-identification; cross-modality discrepancy; auxiliary representation;

DOI:

10.1109/TMM.2024.3521773

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Visible-Infrared Person Re-identification (VI-ReID) aims to retrieve images of specific identities across modalities. To relieve the large cross-modality discrepancy, researchers introduce an auxiliary modality within the image space to assist modality-invariant representation learning. However, it remains difficult to constrain the inherent quality of the generated auxiliary images, which leads to a bottleneck in retrieval performance. In this paper, we propose a novel Auxiliary Representation Guided Network (ARGN) to explore the potential of auxiliary representations, which are generated directly within the modality-shared embedding space. In contrast to the original visible and infrared representations, which contain information solely from their respective modalities, these auxiliary representations integrate cross-modality information by fusing both modalities. In our framework, we utilize these auxiliary representations as modality guidance to reduce the cross-modality discrepancy. First, we propose a High-quality Auxiliary Representation Learning (HARL) framework to generate identity-consistent auxiliary representations. The primary objective of HARL is to ensure that auxiliary representations capture diverse information from both modalities while preserving identity-related discrimination. Second, guided by these auxiliary representations, we design an Auxiliary Representation Guided Constraint (ARGC) to optimize the modality-shared embedding space. With this constraint, the modality-shared embedding space achieves enhanced intra-identity compactness and inter-identity separability, further improving retrieval performance. In addition, to improve the robustness of our framework against modality variation, we introduce a Part-based Adaptive Gaussian Module (PAGM) to adaptively extract discriminative information across modalities. Finally, extensive experiments on three VI-ReID datasets demonstrate the superiority of our method over state-of-the-art approaches.

Citation

Pages: 340-355

Page count: 16

References: 63 in total

