Disentangled body features for clothing change person re-identification

Cited by: 4
Authors
Ding, Yongkang [1 ]
Wu, Yinghao [1 ]
Wang, Anqi [1 ]
Gong, Tiantian [1 ]
Zhang, Liyan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 210016, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Person re-identification; Clothes-changing scenarios; Vision transformer; Semantic segmentation; Disentangled features;
DOI
10.1007/s11042-024-18440-4
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
With the rapid development of computer vision and deep learning, person re-identification (ReID) has attracted widespread attention as an important research area. Most current ReID methods focus primarily on short-term re-identification and therefore struggle when pedestrians change clothes and their appearance shifts significantly. To address this, this paper proposes a clothes-changing person re-identification (CC-ReID) method, SViT-ReID, built on a Vision Transformer and incorporating semantic information. The method integrates semantic segmentation maps to extract more accurate features and representations of pedestrian instances in complex scenes, enabling the model to learn cues unrelated to clothing. Specifically, clothing-unrelated features (such as the face, arms, legs, and feet) are extracted from the output of a pedestrian parsing task and fused with global features to emphasize these body cues. In addition, the complete semantic features derived from pedestrian parsing are fused with global features; the fused features then undergo shuffle and grouping operations to generate local features, which are computed in parallel with the global features, improving the model's robustness and accuracy. Experimental evaluations on two real-world benchmarks show that SViT-ReID achieves state-of-the-art performance, reaching Top-1 accuracy of 55.2% on PRCC and 43.4% on LTCC. Extensive ablation studies and visualizations further illustrate the effectiveness of the method.
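The following is a minimal, illustrative PyTorch sketch of the two ideas summarized in the abstract, written under our own assumptions rather than taken from the paper: the tensor shapes, parsing label ids, additive fusion, and the names part_masked_features and ShuffleGroupHead are all hypothetical. It shows how ViT patch tokens might be pooled over clothing-unrelated body regions given a down-sampled parsing map, and how shuffled patch tokens can be grouped and pooled into local features computed alongside the global feature.

# Minimal sketch (assumptions, not the authors' code): parsing-guided pooling of
# clothing-unrelated regions plus a shuffle-and-group head for local features.
import torch
import torch.nn as nn

# Hypothetical label ids for clothing-unrelated regions in a human-parsing map
# (the actual ids depend on the parsing dataset used).
CLOTHING_UNRELATED_IDS = [2, 13, 14, 15]  # e.g. face, arms, legs, feet

def part_masked_features(patch_tokens, parsing_labels, part_ids):
    # patch_tokens:   (B, N, D) ViT patch embeddings
    # parsing_labels: (B, N) per-patch parsing label (down-sampled parsing map)
    mask = torch.zeros_like(parsing_labels, dtype=torch.bool)
    for pid in part_ids:
        mask |= parsing_labels == pid
    mask = mask.unsqueeze(-1).float()                 # (B, N, 1)
    denom = mask.sum(dim=1).clamp(min=1.0)            # avoid divide-by-zero
    return (patch_tokens * mask).sum(dim=1) / denom   # (B, D) body-part feature

class ShuffleGroupHead(nn.Module):
    # Shuffle patch tokens, split them into groups, and pool each group
    # into one local feature vector.
    def __init__(self, dim, num_groups=4):
        super().__init__()
        self.num_groups = num_groups
        self.norm = nn.LayerNorm(dim)

    def forward(self, tokens):                        # tokens: (B, N, D)
        b, n, d = tokens.shape
        perm = torch.randperm(n, device=tokens.device)
        groups = tokens[:, perm].chunk(self.num_groups, dim=1)
        return [self.norm(g.mean(dim=1)) for g in groups]  # list of (B, D)

# Toy usage with random tensors standing in for ViT outputs and a parsing map.
B, N, D = 2, 128, 768
patch_tokens = torch.randn(B, N, D)
cls_global = torch.randn(B, D)
parsing_labels = torch.randint(0, 20, (B, N))

body_feat = part_masked_features(patch_tokens, parsing_labels, CLOTHING_UNRELATED_IDS)
fused_global = cls_global + body_feat                 # simple additive fusion (an assumption)
local_feats = ShuffleGroupHead(D, num_groups=4)(patch_tokens)
print(fused_global.shape, [f.shape for f in local_feats])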
Pages: 69693-69714
Page count: 22