Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification

Cited by: 5
Authors
Wan, Lin [1]
Jing, Qianyan [1]
Sun, Zongyuan [1]
Zhang, Chuang [2]
Li, Zhihang [3]
Chen, Yehansen [1]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430078, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Keywords
Task analysis; Training; Feature extraction; Lighting; Cameras; Visualization; Self-supervised learning; Cross-modality person re-identification; self-supervised learning; multi-modality pre-training
DOI: 10.1109/TIFS.2023.3273911
CLC number
TP301 [Theory, Methods]
Discipline code
081202
Abstract
RGB-Infrared person re-identification (RGB-IR ReID) aims to associate people across disjoint RGB and IR camera views. Currently, state-of-the-art performance on RGB-IR ReID is not as impressive as that of conventional ReID. Much of this gap stems from the notorious modality-bias training issue introduced by single-modality ImageNet pre-training, which can yield RGB-biased representations that severely hinder cross-modality image retrieval. This paper makes the first attempt to tackle the task from a pre-training perspective. We propose a self-supervised pre-training solution, named Modality-Aware Multiple Granularity Learning (MMGL), which trains models from scratch solely on multi-modal ReID datasets, yet achieves results competitive with ImageNet pre-training without using any external data or sophisticated tuning tricks. First, we develop a simple but effective 'permutation recovery' pretext task that globally maps shuffled RGB-IR images into a shared latent permutation space, providing modality-invariant global representations for downstream ReID tasks. Second, we present a part-aware cycle-contrastive (PCC) learning strategy that exploits cross-modality cycle-consistency to maximize agreement between semantically similar RGB-IR image patches. This enables contrastive learning in unpaired multi-modal scenarios, further improving the discriminability of local features without laborious instance augmentation. Together, these designs allow MMGL to effectively alleviate the modality-bias training problem. Extensive experiments demonstrate that it learns better representations (+8.03% Rank-1 accuracy) with faster training (converging in only a few hours) and higher data efficiency (<5% of the data size) than ImageNet pre-training. The results also suggest that it generalizes well to various existing models and losses and has promising transferability across datasets. The code will be released at https://github.com/hansonchen1996/MMGL.
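As a rough illustration of the 'permutation recovery' pretext task described above, the sketch below shuffles an image's patches according to a fixed permutation vocabulary and yields a classification label for the applied permutation. This is a minimal jigsaw-style reading under stated assumptions: `shuffle_patches`, `PERMS`, `make_pretext_sample`, and the 2x2 grid are illustrative names and choices, not the paper's exact configuration.

```python
import numpy as np

# Illustrative permutation vocabulary: the same fixed set is applied to both
# RGB and IR crops, so both modalities map into one shared "permutation space".
PERMS = [(0, 1, 2, 3), (1, 0, 3, 2), (3, 2, 1, 0), (2, 3, 0, 1)]

def shuffle_patches(img, perm, grid=2):
    """Split an (H, W) image into grid*grid patches and reorder them by perm."""
    h, w = img.shape
    ph, pw = h // grid, w // grid
    patches = [img[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
               for r in range(grid) for c in range(grid)]
    reordered = [patches[i] for i in perm]
    rows = [np.hstack(reordered[r * grid:(r + 1) * grid]) for r in range(grid)]
    return np.vstack(rows)

def make_pretext_sample(img, rng):
    """Return a shuffled image and the index of the permutation that was applied.

    A network trained to predict this index from either an RGB or an IR input
    must rely on cues shared by both modalities (permutation recovery)."""
    label = int(rng.integers(len(PERMS)))
    return shuffle_patches(img, PERMS[label]), label
```

Because the permutation label is defined identically for both modalities, optimizing this objective on mixed RGB-IR batches would encourage representations that do not depend on which spectrum the input came from, which is the modality-invariance property the abstract claims for the global branch.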
Pages: 3044-3057
Page count: 14
相关论文
共 26 条
  • [21] Diffusion Augmentation and Pose Generation Based Pre-Training Method for Robust Visible-Infrared Person Re-Identification
    Sun, Rui
    Huang, Guoxi
    Xie, Ruirui
    Wang, Xuebin
    Chen, Long
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2670 - 2674
  • [22] Learn from restoration: exploiting task-oriented knowledge distillation in self-supervised person re-identification
    Yang, Enze
    Liu, Yuxin
    Zhao, Shitao
    Liu, Yiran
    Liu, Shuoyan
    VISUAL COMPUTER, 2025, : 6313 - 6326
  • [23] Enhancing New Multiple Sclerosis Lesion Segmentation via Self-supervised Pre-training and Synthetic Lesion Integration
    Tahghighi, Peyman
    Zhang, Yunyan
    Souza, Roberto
    Komeili, Amin
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT VIII, 2024, 15008 : 263 - 272
  • [24] A Self-Supervised Gait Encoding Approach With Locality-Awareness for 3D Skeleton Based Person Re-Identification
    Rao, Haocong
    Wang, Siqi
    Hu, Xiping
    Tan, Mingkui
    Guo, Yi
    Cheng, Jun
    Liu, Xinwang
    Hu, Bin
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (10) : 6649 - 6666
  • [25] Enhancing Visible-Infrared Person Re-identification with Modality- and Instance-aware Visual Prompt Learning
    Wu, Ruiqi
    Jiao, Bingliang
    Wang, Wenxuan
    Liu, Meng
    Wang, Peng
    PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 579 - 588
  • [26] Infrared-Visible Person Re-Identification via Multi-Modality Feature Fusion and Self-Distillation
    Wan, Lei
    Li, Huafeng
    Zhang, Yafei
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2024, 36 (07): : 1065 - 1076