MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification

Cited by: 46
Authors
Gao, Yajun [1 ]
Liang, Tengfei [1 ]
Jin, Yi [1 ]
Gu, Xiaoyan [2 ]
Liu, Wu [3 ]
Li, Yidong [1 ]
Lang, Congyan [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[3] JD AI Res, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
National Natural Science Foundation of China;
Keywords
Person Re-identification; Cross-Modality; Multi-Feature Space; Joint Optimization
DOI
10.1145/3474085.3475643
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The RGB-infrared cross-modality person re-identification (ReID) task aims to match images of the same identity across the visible and infrared modalities. Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space, ignoring the single-modality space of each modality in the shallow layers. To address this, we present a novel multi-feature space joint optimization (MSO) network, which learns modality-sharable features in both the single-modality spaces and the common space. First, based on the observation that edge information is modality-invariant, we propose an edge features enhancement module to enhance the modality-sharable features in each single-modality space. Specifically, we design a perceptual edge features (PEF) loss following an analysis of edge fusion strategies. To the best of our knowledge, this is the first work to propose explicit optimization in the single-modality feature spaces for the cross-modality ReID task. Moreover, to increase the gap between the cross-modality distance and the class distance, we introduce a novel cross-modality contrastive-center (CMCC) loss into the modality-joint constraints in the common feature space. The PEF loss and CMCC loss jointly optimize the model in an end-to-end manner, which markedly improves the network's performance. Extensive experiments demonstrate that the proposed model significantly outperforms state-of-the-art methods on both the SYSU-MM01 and RegDB datasets.
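The abstract does not give the CMCC loss in closed form. As a rough, hypothetical illustration of the stated idea (making the cross-modality distance smaller than the class distance), a contrastive-center-style objective could compare each identity's RGB-center-to-IR-center distance against its distance to other identities' centers; the function name, margin, and exact formulation below are assumptions, not the paper's definition:

```python
import numpy as np

def cmcc_loss_sketch(rgb_feats, ir_feats, labels, margin=0.3):
    """Hypothetical sketch of a cross-modality contrastive-center style loss.

    For each identity, the RGB-feature center and IR-feature center are
    pulled together, while centers of different identities are pushed apart.
    The MSO paper's actual CMCC formulation may differ.
    """
    classes = np.unique(labels)
    # Per-identity feature centers in each modality.
    rgb_centers = np.stack([rgb_feats[labels == c].mean(axis=0) for c in classes])
    ir_centers = np.stack([ir_feats[labels == c].mean(axis=0) for c in classes])
    # Intra-class cross-modality distance: same identity, different modality.
    intra = np.linalg.norm(rgb_centers - ir_centers, axis=1)
    loss = 0.0
    for i in range(len(classes)):
        others = np.delete(np.arange(len(classes)), i)
        # Inter-class distance: nearest center belonging to another identity.
        inter = min(
            np.linalg.norm(rgb_centers[i] - ir_centers[others], axis=1).min(),
            np.linalg.norm(ir_centers[i] - rgb_centers[others], axis=1).min(),
        )
        # Hinge: the cross-modality distance should be smaller than the
        # class distance by at least `margin`.
        loss += max(0.0, intra[i] - inter + margin)
    return loss / len(classes)
```

With well-separated identities whose RGB and IR centers coincide, the hinge is inactive and the sketch returns 0; when same-identity centers drift apart across modalities, the loss grows.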
Pages: 5257-5265
Page count: 9