IAUnet: Global Context-Aware Feature Learning for Person Reidentification

Cited by: 41
Authors
Hou, Ruibing [1, 2]
Ma, Bingpeng [2]
Chang, Hong [1, 2]
Gu, Xinqian [1, 2]
Shan, Shiguang [1, 2, 3]
Chen, Xilin [1, 2]
Affiliations
[1] Chinese Acad Sci, Key Lab Intelligent Informat Proc, Inst Comp Technol, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Shanghai 200031, Peoples R China
Keywords
Context modeling; Feature extraction; Computational modeling; Semantics; Aggregates; Visualization; Task analysis; Feature enhancing; interaction-aggregation; person reidentification (reID); spatial-temporal context modeling; ATTENTION; NETWORK
DOI
10.1109/TNNLS.2020.3017939
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Person reidentification (reID) with convolutional neural networks (CNNs) has achieved favorable performance in recent years. However, most existing CNN-based methods do not take full advantage of spatial-temporal context modeling. In fact, the global spatial-temporal context can greatly clarify local distractions to enhance the target feature representation. To comprehensively leverage the spatial-temporal context information, in this work, we present a novel block, interaction-aggregation-update (IAU), for high-performance person reID. First, the spatial-temporal IAU (STIAU) module is introduced. STIAU jointly incorporates two types of contextual interactions into a CNN framework for target feature learning. Here, the spatial interactions learn the contextual dependencies between different body parts within a single frame, while the temporal interactions capture the contextual dependencies between the same body parts across all frames. Furthermore, a channel IAU (CIAU) module is designed to model the semantic contextual interactions between channel features to enhance the feature representation, especially for small-scale visual cues and body parts. The IAU block therefore enables the feature to incorporate global spatial, temporal, and channel context. It is lightweight, end-to-end trainable, and can be easily plugged into existing CNNs to form IAUnet. The experiments show that IAUnet performs favorably against the state of the art on both image and video reID tasks and achieves compelling results on a general object categorization task. The source code is available at https://github.com/blue-blue272/ImgReID-IAnet.
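The interaction, aggregation, and update steps named in the abstract can be illustrated for the channel case with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes a generic self-attention-style formulation in which the interaction is a softmax-normalized channel affinity, the aggregation is an affinity-weighted sum of channel features, and the update is a residual addition. The function name channel_context_update and the scaling by the square root of the number of positions are illustrative assumptions.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def channel_context_update(feat):
    """Hypothetical channel interaction-aggregation-update step.

    feat: array of shape (C, H, W), one feature map per channel.
    Returns the updated feature map and the (C, C) channel affinity.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)  # one row per channel
    # Interaction: pairwise channel affinities, each row normalized to sum to 1.
    affinity = softmax(x @ x.T / np.sqrt(h * w), axis=-1)
    # Aggregation: each channel gathers context from all channels.
    context = affinity @ x
    # Update: residual addition keeps the original feature (assumed form).
    updated = (x + context).reshape(c, h, w)
    return updated, affinity


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 2))  # toy feature map: 8 channels, 4x2 grid
updated, affinity = channel_context_update(feat)
print(updated.shape, affinity.shape)
```

A spatial or temporal variant under the same assumptions would compute the affinity over positions or frames instead of channels; the paper itself specifies how STIAU and CIAU actually define these interactions.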
Pages: 4460-4474
Page count: 15