Mask-Guided Region Attention Network for Person Re-Identification

Cited by: 1
Authors
Zhou, Cong [1]
Yu, Han [1, 2]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Comp Sci & Technol, Nanjing 210023, Peoples R China
[2] Jiangsu Key Lab Big Data Secur & Intelligent Proc, Nanjing 210023, Peoples R China
Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2020, PT II | 2020 / Vol. 12085
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Person re-identification; Human pose estimation; Mask
DOI
10.1007/978-3-030-47436-2_22
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Person re-identification (ReID) is an important and practical task that identifies pedestrians across non-overlapping surveillance cameras based on their visual features. In general, ReID is an extremely challenging task due to complex background clutter, large pose variations and severe occlusions. To improve its performance, a robust and discriminative feature extraction methodology is particularly crucial. Recently, the feature alignment technique driven by human pose estimation, that is, matching two person images by their corresponding parts, has increased the effectiveness of ReID to a certain extent. However, we argue that these methods still have problems, such as imprecise handcrafted segmentation of body parts, and that further improvements can be achieved. In this paper, we present a novel framework called Mask-Guided Region Attention Network (MGRAN) for person ReID. MGRAN consists of two major components: Mask-guided Region Attention (MRA) and Multi-feature Alignment (MA). MRA aims to generate spatial attention masks while masking out background clutter and occlusions. Moreover, the generated masks are utilized for region-level feature alignment in the MA module. We then evaluate the proposed method on three public datasets: Market-1501, DukeMTMC-reID and CUHK03. Extensive experiments with ablation analysis show the effectiveness of this method.
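The abstract does not give implementation details, so the following is only a minimal illustrative sketch of the general idea it describes: using a mask to suppress background pixels and then comparing two images region by region. It assumes masked average pooling as the pooling step, which is a common technique and not necessarily what MGRAN does. All function names (`masked_average_pool`, `region_features`, `aligned_distance`) and the toy data are hypothetical.

```python
from math import sqrt


def masked_average_pool(feature_map, mask, eps=1e-6):
    """Average each channel over the pixels selected by a mask.

    feature_map: list of C channels, each an H x W list of floats.
    mask: H x W list of weights in [0, 1] (1 = region, 0 = background).
    Returns a list of C floats. This pooling choice is an assumption.
    """
    area = sum(sum(row) for row in mask)
    pooled = []
    for channel in feature_map:
        total = sum(
            value * weight
            for f_row, m_row in zip(channel, mask)
            for value, weight in zip(f_row, m_row)
        )
        pooled.append(total / (area + eps))
    return pooled


def region_features(feature_map, region_masks):
    """Compute one pooled feature vector per region mask."""
    return [masked_average_pool(feature_map, m) for m in region_masks]


def aligned_distance(regions_a, regions_b):
    """Mean Euclidean distance between same-index regions of two images."""
    dists = [
        sqrt(sum((x - y) ** 2 for x, y in zip(ra, rb)))
        for ra, rb in zip(regions_a, regions_b)
    ]
    return sum(dists) / len(dists)


# Toy example: one channel, 2 x 2 map. The top row stands in for the
# person and the bottom row for background clutter with large activations.
fmap = [[[1.0, 1.0], [100.0, 100.0]]]
top_region = [[1, 1], [0, 0]]
feats = region_features(fmap, [top_region])
```

In this toy case the clutter values in the bottom row carry zero weight, so the pooled feature reflects only the masked region; comparing region lists with `aligned_distance` then matches each region against its counterpart in the other image.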
Pages: 286-298
Page count: 13