Disentangled body features for clothing change person re-identification

Cited by: 4
Authors
Ding, Yongkang [1 ]
Wu, Yinghao [1 ]
Wang, Anqi [1 ]
Gong, Tiantian [1 ]
Zhang, Liyan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 210016, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Person re-identification; Clothes-changing scenarios; Vision transformer; Semantic segmentation; Disentangled features;
DOI
10.1007/s11042-024-18440-4
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
With the rapid development of computer vision and deep learning, person re-identification (ReID) has attracted widespread attention as an important research area. Most current ReID methods focus primarily on short-term re-identification and therefore struggle when pedestrians change clothes and their appearance shifts significantly. To address this, this paper proposes a clothes-changing person re-identification (CC-ReID) method, SViT-ReID, built on a Vision Transformer and incorporating semantic information. The method integrates semantic segmentation maps to extract more accurate features and representations of pedestrian instances in complex scenes, enabling the model to learn cues unrelated to clothing. Specifically, clothing-unrelated features (such as the face, arms, legs, and feet) are extracted from the output of a pedestrian parsing task and fused with global features to emphasize these body cues. In addition, the complete semantic features derived from pedestrian parsing are fused with global features; the fused features then undergo shuffle and grouping operations to generate local features, which are computed in parallel with the global features, improving the model's robustness and accuracy. Experimental evaluations on two real-world benchmarks show that SViT-ReID achieves state-of-the-art performance, reaching Top-1 accuracy of 55.2% on PRCC and 43.4% on LTCC. Extensive ablation studies and visualizations further illustrate the effectiveness of the method.
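The following is a minimal, illustrative PyTorch sketch of the two ideas summarized in the abstract, written under our own assumptions rather than taken from the paper: the tensor shapes, parsing label ids, additive fusion, and the names part_masked_features and ShuffleGroupHead are all hypothetical. It shows how ViT patch tokens might be pooled over clothing-unrelated body regions given a down-sampled parsing map, and how shuffled patch tokens can be grouped and pooled into local features computed alongside the global feature.

# Minimal sketch (assumptions, not the authors' code): parsing-guided pooling of
# clothing-unrelated regions plus a shuffle-and-group head for local features.
import torch
import torch.nn as nn

# Hypothetical label ids for clothing-unrelated regions in a human-parsing map
# (the actual ids depend on the parsing dataset used).
CLOTHING_UNRELATED_IDS = [2, 13, 14, 15]  # e.g. face, arms, legs, feet

def part_masked_features(patch_tokens, parsing_labels, part_ids):
    # patch_tokens:   (B, N, D) ViT patch embeddings
    # parsing_labels: (B, N) per-patch parsing label (down-sampled parsing map)
    mask = torch.zeros_like(parsing_labels, dtype=torch.bool)
    for pid in part_ids:
        mask |= parsing_labels == pid
    mask = mask.unsqueeze(-1).float()                 # (B, N, 1)
    denom = mask.sum(dim=1).clamp(min=1.0)            # avoid divide-by-zero
    return (patch_tokens * mask).sum(dim=1) / denom   # (B, D) body-part feature

class ShuffleGroupHead(nn.Module):
    # Shuffle patch tokens, split them into groups, and pool each group
    # into one local feature vector.
    def __init__(self, dim, num_groups=4):
        super().__init__()
        self.num_groups = num_groups
        self.norm = nn.LayerNorm(dim)

    def forward(self, tokens):                        # tokens: (B, N, D)
        b, n, d = tokens.shape
        perm = torch.randperm(n, device=tokens.device)
        groups = tokens[:, perm].chunk(self.num_groups, dim=1)
        return [self.norm(g.mean(dim=1)) for g in groups]  # list of (B, D)

# Toy usage with random tensors standing in for ViT outputs and a parsing map.
B, N, D = 2, 128, 768
patch_tokens = torch.randn(B, N, D)
cls_global = torch.randn(B, D)
parsing_labels = torch.randint(0, 20, (B, N))

body_feat = part_masked_features(patch_tokens, parsing_labels, CLOTHING_UNRELATED_IDS)
fused_global = cls_global + body_feat                 # simple additive fusion (an assumption)
local_feats = ShuffleGroupHead(D, num_groups=4)(patch_tokens)
print(fused_global.shape, [f.shape for f in local_feats])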
Pages: 69693-69714
Page count: 22