MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification

Cited by: 46
Authors
Gao, Yajun [1]
Liang, Tengfei [1]
Jin, Yi [1]
Gu, Xiaoyan [2]
Liu, Wu [3]
Li, Yidong [1]
Lang, Congyan [1]
Affiliations
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[3] JD AI Res, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
National Natural Science Foundation of China
Keywords
Person Re-identification; Cross-Modality; Multi-Feature Space; Joint Optimization
DOI
10.1145/3474085.3475643
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The RGB-infrared cross-modality person re-identification (ReID) task aims to match images of the same identity across the visible and infrared modalities. Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space, which ignores the single-modality space of each modality in the shallow layers. To address this, we present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space. First, based on the observation that edge information is modality-invariant, we propose an edge features enhancement module to enhance the modality-sharable features in each single-modality space. Specifically, we design a perceptual edge features (PEF) loss after analyzing edge fusion strategies. To the best of our knowledge, this is the first work that proposes explicit optimization in the single-modality feature space for the cross-modality ReID task. Moreover, to increase the difference between cross-modality distance and class distance, we introduce a novel cross-modality contrastive-center (CMCC) loss into the modality-joint constraints in the common feature space. The PEF loss and CMCC loss jointly optimize the model in an end-to-end manner, which markedly improves the network's performance. Extensive experiments demonstrate that the proposed model significantly outperforms state-of-the-art methods on both the SYSU-MM01 and RegDB datasets.
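The abstract does not give the CMCC formula, so the following is only a minimal sketch of the general idea it describes: pulling the RGB and infrared centers of one identity together while keeping different identities apart. The ratio form, the function name `cross_modality_center_ratio`, the `delta` constant, and the `"rgb"`/`"ir"` tags are all assumptions made for illustration and should not be read as the paper's definition.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_modality_center_ratio(features, labels, modalities, delta=1.0):
    """Hypothetical ratio loss; NOT the paper's CMCC formulation.

    features:   feature vectors taken from the common feature space
    labels:     identity label of each feature
    modalities: "rgb" or "ir" for each feature (assumed tags)
    delta:      assumed constant that keeps the denominator positive

    Every identity is assumed to have at least one sample per modality.
    """
    grouped = defaultdict(list)
    for f, y, m in zip(features, labels, modalities):
        grouped[(y, m)].append(f)

    ids = sorted(set(labels))
    rgb = {y: mean_vector(grouped[(y, "rgb")]) for y in ids}
    ir = {y: mean_vector(grouped[(y, "ir")]) for y in ids}
    # Joint center of an identity: midpoint of its two modality centers.
    joint = {y: mean_vector([rgb[y], ir[y]]) for y in ids}

    total = 0.0
    for y in ids:
        # Cross-modality distance: gap between the two centers of one identity.
        cross = sq_dist(rgb[y], ir[y])
        # Class distance: separation from the centers of all other identities.
        between = sum(sq_dist(joint[y], joint[z]) for z in ids if z != y)
        total += cross / (between + delta)
    return total / len(ids)
```

Under these assumptions the value is zero when the two modality centers of every identity coincide, and it grows as the modality gap widens relative to the separation between identities.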
Pages: 5257-5265
Page count: 9