A Two-Stage Attribute-Constraint Network for Video-Based Person Re-Identification

Cited by: 10
Authors
Song, Wanru [2 ]
Zheng, Jieying [2 ]
Wu, Yahong [2 ]
Chen, Changhong [2 ]
Liu, Feng [1 ,2 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Jiangsu Key Lab Image Proc & Image Commun, Nanjing 210003, Jiangsu, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Key Lab Broadband Wireless Commun & Sensor Networ, Minist Educ, Nanjing 210003, Jiangsu, Peoples R China
Source
IEEE ACCESS | 2019 / Vol. 7
Funding
National Natural Science Foundation of China;
Keywords
Attribute; constraint; feature extraction; person re-identification; video;
DOI
10.1109/ACCESS.2019.2890836
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Person re-identification has gradually become a popular research topic in fields such as security, criminal investigation, and video analysis. This paper aims to learn a discriminative and robust spatial-temporal representation for video-based person re-identification via a two-stage attribute-constraint network (TSAC-Net). Knowledge of pedestrian attributes can aid re-identification because it carries high-level semantic information and is robust to visual variations. In this paper, we manually annotate three video-based person re-identification datasets with four static appearance attributes and one dynamic appearance attribute. Each attribute is treated as a constraint added to the deep network. In the first stage of the TSAC-Net, we cast re-identification as a classification task and adopt a multi-attribute classification loss to train the CNN model. In the second stage, two LSTM networks are trained under the constraints of identities and dynamic appearance attributes. The two-stage network thus provides a spatial-temporal feature extractor for pedestrians in video sequences. In the testing phase, a spatial-temporal representation is obtained by feeding a sequence of images to the proposed TSAC-Net. We demonstrate the performance improvement gained from the use of attributes on several challenging person re-identification datasets (PRID2011, iLIDS-VID, MARS, and VIPeR). Extensive experiments further show that our approach achieves state-of-the-art results on three video-based benchmark datasets.
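The two-stage pipeline described in the abstract (per-frame spatial features from a CNN in stage one, then temporal aggregation with an LSTM in stage two) can be sketched as follows. This is a minimal NumPy illustration under assumed shapes, not the authors' implementation: `frame_features` is a hypothetical stand-in for the stage-1 CNN, `TinyLSTM` stands in for the stage-2 LSTMs, and the identity/attribute constraint losses used during training are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    """Minimal LSTM cell; hypothetical stand-in for the paper's stage-2 LSTMs."""
    def __init__(self, in_dim, hid_dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(in_dim + hid_dim)
        # input, forget, cell, and output gate weights stacked as 4*hid_dim rows
        self.W = rng.uniform(-scale, scale, (4 * hid_dim, in_dim + hid_dim))
        self.b = np.zeros(4 * hid_dim)
        self.hid_dim = hid_dim

    def forward(self, seq):
        h = np.zeros(self.hid_dim)
        c = np.zeros(self.hid_dim)
        for x in seq:  # one time step per video frame
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
            c = f * c + i * np.tanh(g)
            h = o * np.tanh(c)
        return h  # final hidden state = sequence-level descriptor

def frame_features(frames, feat_dim=64, seed=1):
    """Hypothetical stage-1 CNN: mocked here as a fixed random projection."""
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((frames.shape[1], feat_dim)) / np.sqrt(frames.shape[1])
    return frames @ proj

# Toy video sequence: 8 frames, each a flattened 32-dim "image".
video = np.random.default_rng(2).standard_normal((8, 32))
per_frame = frame_features(video)       # stage 1: per-frame spatial features
lstm = TinyLSTM(in_dim=64, hid_dim=16)
descriptor = lstm.forward(per_frame)    # stage 2: temporal aggregation
print(descriptor.shape)                 # (16,)
```

At test time, the paper's matching step would compare such sequence-level descriptors (e.g. by Euclidean or cosine distance) between query and gallery tracklets; that comparison is not shown here.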
Pages: 8508-8518
Page count: 11