SANet: Statistic Attention Network for Video-Based Person Re-Identification

Cited by: 22
Authors
Bai, Shutao [1, 2]
Ma, Bingpeng [2]
Chang, Hong [1, 2]
Huang, Rui [3, 4]
Shan, Shiguang [1, 2]
Chen, Xilin [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
[3] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518172, Guangdong, Peoples R China
[4] Shenzhen Inst Artificial Intelligence & Robot, Shenzhen 518172, Guangdong, Peoples R China
Keywords
Feature extraction; Task analysis; Computational modeling; Visualization; Video sequences; Fuses; Computer science; Person re-identification; self-attention; long-range dependencies; high-order statistics;
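DOI
10.1109/TCSVT.2021.3119983
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract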
Capturing long-range dependencies during feature extraction is crucial for video-based person re-identification (re-id) since it would help to tackle many challenging problems such as occlusion and dramatic pose variation. Moreover, capturing subtle differences, such as bags and glasses, is indispensable to distinguish similar pedestrians. In this paper, we propose a novel and efficacious Statistic Attention (SA) block which can capture both the long-range dependencies and subtle differences. SA block leverages high-order statistics of feature maps, which contain both long-range and high-order information. By modeling relations with these statistics, SA block can explicitly capture long-range dependencies with less time complexity. In addition, high-order statistics usually concentrate on details of feature maps and can perceive the subtle differences between pedestrians. In this way, SA block is capable of discriminating pedestrians with subtle differences. Furthermore, this lightweight block can be conveniently inserted into existing deep neural networks at any depth to form Statistic Attention Network (SANet). To evaluate its performance, we conduct extensive experiments on two challenging video re-id datasets, showing that our SANet outperforms the state-of-the-art methods. Furthermore, to show the generalizability of SANet, we evaluate it on three image re-id datasets and two more general image classification datasets, including ImageNet. The source code is available at http://vipl.ict.ac.cn/resources/codes/code/SANet_code.zip.
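The abstract's central idea, attending to a handful of feature-map statistics instead of to every other position, can be illustrated with a minimal NumPy sketch. This is an illustrative assumption, not the authors' implementation: the function names `channel_moments` and `statistic_attention`, the use of per-channel central moments as the "high-order statistics", the scaled dot-product scoring, and the residual connection are all hypothetical choices made here for the example. The released SANet code should be consulted for the actual design.

```python
import numpy as np

def channel_moments(x, max_order=3):
    """Per-channel statistics of a flattened feature map.

    x: array of shape (N, C), N = T*H*W positions, C channels.
    Returns (max_order, C): the mean, then central moments of order 2..max_order.
    (Hypothetical choice of statistics, for illustration only.)
    """
    mu = x.mean(axis=0)
    centered = x - mu
    stats = [mu] + [(centered ** k).mean(axis=0) for k in range(2, max_order + 1)]
    return np.stack(stats)

def statistic_attention(x, max_order=3):
    """Each position attends to K statistics rather than to all N positions.

    The score matrix is (N, K) instead of (N, N), so the cost is O(N*K*C).
    """
    stats = channel_moments(x, max_order)               # (K, C)
    scores = x @ stats.T / np.sqrt(x.shape[1])          # (N, K)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # softmax over K
    return x + weights @ stats, weights                 # assumed residual connection

rng = np.random.default_rng(0)
feats = rng.standard_normal((4 * 8 * 4, 16))  # toy clip: T*H*W = 128 positions, C = 16
out, w = statistic_attention(feats)
print(out.shape, w.shape)
```

Because every statistic is computed over all positions, each one summarizes the whole clip, which is one way to read the abstract's claim that relations modeled on statistics capture long-range dependencies at lower cost than pairwise position-to-position attention.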
Pages: 3866-3879
Page count: 14