IAUnet: Global Context-Aware Feature Learning for Person Reidentification

Cited by: 41

Authors:

Hou, Ruibing [1,2]

Ma, Bingpeng [2]

Chang, Hong [1,2]

Gu, Xinqian [1,2]

Shan, Shiguang [1,2,3]

Chen, Xilin [1,2]

Affiliations:

[1] Chinese Acad Sci, Key Lab Intelligent Informat Proc, Inst Comp Technol, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China

[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Shanghai 200031, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2021, Vol. 32, No. 10

Keywords:

Context modeling; Feature extraction; Computational modeling; Semantics; Aggregates; Visualization; Task analysis; Feature enhancing; interaction-aggregation; person reidentification (reID); spatial-temporal context modeling; ATTENTION; NETWORK;

DOI:

10.1109/TNNLS.2020.3017939

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Person reidentification (reID) by convolutional neural network (CNN)-based networks has achieved favorable performance in recent years. However, most existing CNN-based methods do not take full advantage of spatial-temporal context modeling. In fact, the global spatial-temporal context can greatly clarify local distractions to enhance the target feature representation. To comprehensively leverage the spatial-temporal context information, in this work, we present a novel block, interaction-aggregation-update (IAU), for high-performance person reID. First, the spatial-temporal IAU (STIAU) module is introduced. STIAU jointly incorporates two types of contextual interactions into a CNN framework for target feature learning. Here, the spatial interactions learn to compute the contextual dependencies between different body parts of a single frame, while the temporal interactions are used to capture the contextual dependencies between the same body parts across all frames. Furthermore, a channel IAU (CIAU) module is designed to model the semantic contextual interactions between channel features to enhance the feature representation, especially for small-scale visual cues and body parts. Therefore, the IAU block enables the feature to incorporate the global spatial, temporal, and channel context. It is lightweight, end-to-end trainable, and can be easily plugged into existing CNNs to form IAUnet. The experiments show that IAUnet performs favorably against the state of the art on both image and video reID tasks and achieves compelling results on a general object categorization task. The source code is available at https://github.com/blue-blue272/ImgReID-IAnet.

Pages: 4460-4474

Page count: 15
