Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification

Cited by: 5

Authors:

Wan, Lin [1]

Jing, Qianyan [1]

Sun, Zongyuan [1]

Zhang, Chuang [2]

Li, Zhihang [3]

Chen, Yehansen [1]

Affiliations:

[1] China Univ Geosci, Sch Comp Sci, Wuhan 430078, Peoples R China

[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

[3] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China

Source:

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY | 2023 / Vol. 18

Keywords:

Task analysis; Training; Feature extraction; Lighting; Cameras; Visualization; Self-supervised learning; Cross-modality person re-identification; multi-modality pre-training

DOI:

10.1109/TIFS.2023.3273911

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

RGB-Infrared person re-identification (RGB-IR ReID) aims to associate people across disjoint RGB and IR camera views. Currently, the state-of-the-art performance of RGB-IR ReID is not as impressive as that of conventional ReID. Much of that is due to the notorious modality bias training issue brought by single-modality ImageNet pre-training, which might yield RGB-biased representations that severely hinder cross-modality image retrieval. This paper makes the first attempt to tackle the task from a pre-training perspective. We propose a self-supervised pre-training solution, named Modality-Aware Multiple Granularity Learning (MMGL), which trains models from scratch directly on multi-modal ReID datasets alone, yet achieves competitive results against ImageNet pre-training without using any external data or sophisticated tuning tricks. First, we develop a simple-but-effective 'permutation recovery' pretext task that globally maps shuffled RGB-IR images into a shared latent permutation space, providing modality-invariant global representations for downstream ReID tasks. Second, we present a part-aware cycle-contrastive (PCC) learning strategy that utilizes cross-modality cycle-consistency to maximize agreement between semantically similar RGB-IR image patches. This enables contrastive learning in unpaired multi-modal scenarios, further improving the discriminability of local features without laborious instance augmentation. Based on these designs, MMGL effectively alleviates the modality bias training problem. Extensive experiments demonstrate that it learns better representations (+8.03% Rank-1 accuracy) with faster training speed (converging in only a few hours) and higher data efficiency (< 5% data size) than ImageNet pre-training. The results also suggest that it generalizes well to various existing models and losses and has promising transferability across datasets. The code will be released at https://github.com/hansonchen1996/MMGL.
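The two ideas named in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract gives no architectural details, so the horizontal-stripe split, the nearest-neighbour matching rule, and every function name below (`shuffle_stripes`, `recover_image`, `cycle_consistent_pairs`) are assumptions made purely for illustration. The first part builds a shuffled input together with the permutation label that a model could be trained to recover; the second part keeps only those RGB-IR patch pairs whose nearest-neighbour match cycles back to the starting patch.

```python
import random


def split_into_stripes(image, num_stripes):
    """Split an image (a list of pixel rows) into equal-height horizontal stripes."""
    height = len(image)
    if height % num_stripes != 0:
        raise ValueError("image height must be divisible by num_stripes")
    step = height // num_stripes
    return [image[i * step:(i + 1) * step] for i in range(num_stripes)]


def shuffle_stripes(image, num_stripes, rng):
    """Shuffle stripes; permutation[k] is the original index of the stripe at position k.

    The permutation acts as a free, self-supervised label (assumed setup).
    """
    stripes = split_into_stripes(image, num_stripes)
    permutation = list(range(num_stripes))
    rng.shuffle(permutation)
    shuffled = [row for k in permutation for row in stripes[k]]
    return shuffled, permutation


def recover_image(shuffled, permutation):
    """Undo the shuffle, given the (true or predicted) permutation."""
    stripes = split_into_stripes(shuffled, len(permutation))
    restored = [None] * len(permutation)
    for position, original_index in enumerate(permutation):
        restored[original_index] = stripes[position]
    return [row for stripe in restored for row in stripe]


def nearest(query, candidates):
    """Index of the candidate with the smallest squared Euclidean distance to query."""
    return min(
        range(len(candidates)),
        key=lambda j: sum((q - c) ** 2 for q, c in zip(query, candidates[j])),
    )


def cycle_consistent_pairs(rgb_feats, ir_feats):
    """Return (rgb_index, ir_index) pairs where RGB -> IR -> RGB returns to the start.

    Such pairs could serve as positives for a contrastive loss when no
    pixel-aligned RGB-IR pairs exist (hypothetical simplification).
    """
    pairs = []
    for i, feat in enumerate(rgb_feats):
        j = nearest(feat, ir_feats)
        if nearest(ir_feats[j], rgb_feats) == i:
            pairs.append((i, j))
    return pairs
```

In this sketch, calling `shuffle_stripes(image, 3, random.Random(0))` yields a shuffled image and its permutation label, and `recover_image` restores the original when given that label. A real pre-training setup would replace the raw pixel rows and hand-written feature vectors with learned network embeddings.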


Pages: 3044-3057

Page count: 14
