A Two-Stage Attribute-Constraint Network for Video-Based Person Re-Identification

Cited by: 10
Authors
Song, Wanru [2 ]
Zheng, Jieying [2 ]
Wu, Yahong [2 ]
Chen, Changhong [2 ]
Liu, Feng [1 ,2 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Jiangsu Key Lab Image Proc & Image Commun, Nanjing 210003, Jiangsu, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Key Lab Broadband Wireless Commun & Sensor Networ, Minist Educ, Nanjing 210003, Jiangsu, Peoples R China
Source
IEEE ACCESS | 2019 / Vol. 7
Funding
National Natural Science Foundation of China;
Keywords
Attribute; constraint; feature extraction; person re-identification; video;
DOI
10.1109/ACCESS.2019.2890836
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Person re-identification has gradually become a popular research topic in fields such as security, criminal investigation, and video analysis. This paper aims to learn a discriminative and robust spatial-temporal representation for video-based person re-identification via a two-stage attribute-constraint network (TSAC-Net). Knowledge of pedestrian attributes can aid re-identification because it carries high-level semantic information and is robust to visual variations. In this paper, we manually annotate three video-based person re-identification datasets with four static appearance attributes and one dynamic appearance attribute. Each attribute is treated as a constraint added to the deep network. In the first stage of the TSAC-Net, we cast re-identification as a classification task and adopt a multi-attribute classification loss to train the CNN model. In the second stage, two LSTM networks are trained under the constraints of identities and dynamic appearance attributes. The two-stage network thus provides a spatial-temporal feature extractor for pedestrians in video sequences. In the testing phase, a spatial-temporal representation is obtained by feeding a sequence of images to the proposed TSAC-Net. We demonstrate the performance improvement gained from the use of attributes on several challenging person re-identification datasets (PRID2011, iLIDS-VID, MARS, and VIPeR). Extensive experiments further show that our approach achieves state-of-the-art results on three video-based benchmark datasets.
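The two-stage pipeline described in the abstract (per-frame spatial features from a CNN in stage one, then temporal aggregation with an LSTM in stage two) can be sketched as follows. This is a minimal NumPy illustration under assumed shapes, not the authors' implementation: `frame_features` is a hypothetical stand-in for the stage-1 CNN, `TinyLSTM` stands in for the stage-2 LSTMs, and the identity/attribute constraint losses used during training are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    """Minimal LSTM cell; hypothetical stand-in for the paper's stage-2 LSTMs."""
    def __init__(self, in_dim, hid_dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(in_dim + hid_dim)
        # input, forget, cell, and output gate weights stacked as 4*hid_dim rows
        self.W = rng.uniform(-scale, scale, (4 * hid_dim, in_dim + hid_dim))
        self.b = np.zeros(4 * hid_dim)
        self.hid_dim = hid_dim

    def forward(self, seq):
        h = np.zeros(self.hid_dim)
        c = np.zeros(self.hid_dim)
        for x in seq:  # one time step per video frame
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
            c = f * c + i * np.tanh(g)
            h = o * np.tanh(c)
        return h  # final hidden state = sequence-level descriptor

def frame_features(frames, feat_dim=64, seed=1):
    """Hypothetical stage-1 CNN: mocked here as a fixed random projection."""
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((frames.shape[1], feat_dim)) / np.sqrt(frames.shape[1])
    return frames @ proj

# Toy video sequence: 8 frames, each a flattened 32-dim "image".
video = np.random.default_rng(2).standard_normal((8, 32))
per_frame = frame_features(video)       # stage 1: per-frame spatial features
lstm = TinyLSTM(in_dim=64, hid_dim=16)
descriptor = lstm.forward(per_frame)    # stage 2: temporal aggregation
print(descriptor.shape)                 # (16,)
```

At test time, the paper's matching step would compare such sequence-level descriptors (e.g. by Euclidean or cosine distance) between query and gallery tracklets; that comparison is not shown here.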
Pages: 8508-8518
Page count: 11