Parameter Sharing Exploration and Hetero-Center Triplet Loss for Visible-Thermal Person Re-Identification

Cited by: 167
Authors
Liu, Haijun [1 ]
Tan, Xiaoheng [1 ]
Zhou, Xichuan [1 ]
Affiliations
[1] Chongqing Univ, Sch Microelectron & Commun Engn, Chongqing 400044, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Feature extraction; Cameras; Training; Task analysis; Measurement; Generative adversarial networks; Loss measurement; Cross-modality discrepancy; hetero-center triplet loss; parameters sharing; visible-thermal person re-identification;
DOI
10.1109/TMM.2020.3042080
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
This paper focuses on the visible-thermal cross-modality person re-identification (VT Re-ID) task, whose goal is to match person images between the daytime visible modality and the nighttime thermal modality. A two-stream network is usually adopted to address the cross-modality discrepancy, the most challenging problem in VT Re-ID, by learning multi-modality person features. In this paper, we explore how many parameters a two-stream network should share, a question that has not been well investigated in the existing literature. By splitting the ResNet50 model into a modality-specific feature extraction network and a modality-shared feature embedding network, we experimentally demonstrate the effect of parameter sharing in a two-stream network for VT Re-ID. Moreover, within a part-level person feature learning framework, we propose the hetero-center triplet loss, which relaxes the strict constraint of the traditional triplet loss by replacing comparisons between an anchor and all other samples with comparisons between the anchor center and all other class centers. Despite its simplicity, the proposed method significantly improves VT Re-ID performance. Experimental results on two datasets show that our method distinctly outperforms state-of-the-art methods by large margins, especially on the RegDB dataset, where it achieves rank-1/mAP/mINP of 91.05%/83.28%/68.84%. With its simple but effective strategy, it can serve as a new baseline for VT Re-ID.
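To make the center-based comparison concrete, below is a minimal PyTorch sketch of a hetero-center triplet loss. The abstract only states that anchor-to-sample comparisons are replaced by comparisons between centers; computing one center per (identity, modality) pair within the mini-batch, batch-hard mining at the center level, and the margin value of 0.3 are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F


def hetero_center_triplet_loss(feats, labels, modalities, margin=0.3):
    """Sketch of a hetero-center triplet loss: compare centers, not samples.

    Assumption: one center per (identity, modality) pair in the batch,
    with a batch-hard triplet constraint applied between centers.
    """
    centers, center_pids = [], []
    for pid in labels.unique():
        for m in modalities.unique():
            mask = (labels == pid) & (modalities == m)
            if mask.any():
                centers.append(feats[mask].mean(dim=0))
                center_pids.append(pid)
    centers = torch.stack(centers)          # (C, d) modality-identity centers
    center_pids = torch.stack(center_pids)  # (C,)   identity of each center

    dist = torch.cdist(centers, centers)    # (C, C) Euclidean distances

    loss, n = feats.new_zeros(()), 0
    for i in range(len(centers)):
        pos = center_pids == center_pids[i]
        pos[i] = False                      # exclude the anchor center itself
        neg = center_pids != center_pids[i]
        if pos.any() and neg.any():
            # Hardest positive / hardest negative center for this anchor center.
            loss = loss + F.relu(dist[i][pos].max() - dist[i][neg].min() + margin)
            n += 1
    return loss / max(n, 1)


# Toy usage: 32 pooled features, 8 identities, two modalities (0=visible, 1=thermal).
feats = torch.randn(32, 512)
labels = torch.randint(0, 8, (32,))
modalities = torch.randint(0, 2, (32,))
print(hetero_center_triplet_loss(feats, labels, modalities))
```

Because each identity contributes at most one center per modality, the number of pairwise comparisons is far smaller than in a sample-level triplet loss, which is what relaxes the strict per-sample constraint described in the abstract.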
Pages: 4414-4425
Number of pages: 12