Person Retrieval in Surveillance Videos Via Deep Attribute Mining and Reasoning

Cited by: 38
Authors
Shi, Yuxuan [1 ]
Wei, Zhen [2 ]
Ling, Hefei [1 ]
Wang, Ziyang [1 ]
Shen, Jialie [3 ]
Li, Ping [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, 1037 Luoyu Rd, Wuhan 430074, Peoples R China
[2] Ecole Polytech Fed Lausanne, Sch Comp & Commun Sci, CH-1015 Lausanne, Switzerland
[3] Queens Univ Belfast, Belfast BT7 1NN, Antrim, North Ireland
Keywords
Cognition; Feature extraction; Hair; Semantics; Training; Robustness; Convolution; Person retrieval; person re-identification; human attribute; graph convolutional network; NEURAL-NETWORK; REIDENTIFICATION; IDENTIFICATION;
DOI
10.1109/TMM.2020.3042068
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Person retrieval largely relies on the appearance features of pedestrians. The task is considerably more difficult in surveillance videos, because cross-view and cross-camera data with low image resolution, motion blur, occlusion, and other kinds of image degradation limit the extraction of robust appearance features. To build a more reliable person retrieval system, recent works have introduced appearance attribute models that describe and distinguish different persons with high-level semantic concepts. Despite this progress, the value of appearance attributes remains under-explored. On the one hand, existing methods lack concise and precise attribute representations that are specific to each attribute category and, at the same time, able to filter out noisy information from irrelevant spatial locations and useless patterns. On the other hand, correlation and reasoning between different attributes, which could generate more useful information and add robustness to the retrieval system, are neglected. In this paper, we propose an Attribute Mining and Reasoning (AMR) framework that is capable of handling these issues. The AMR makes better use of appearance attributes through two main components. First, the AMR disentangles the representations of different attributes by localizing their spatial positions and identifying their effective patterns in a weakly supervised manner. To achieve more reliable localization, we propose the Attribute Localization Ensemble (ALE) module, which consists of multiple localization heads and a voting mechanism. Second, we introduce the Attribute Reasoning (AR) module to correlate different attributes together with the global appearance features and discover their latent relations to generate more comprehensive descriptions of pedestrians.
Extensive experiments on the DukeMTMC-ReID and Market-1501 datasets demonstrate the effectiveness of the proposed AMR framework as well as its superiority over existing state-of-the-art methods. The AMR model also shows strong generalization ability on the unseen CUHK03 dataset when trained only on the Market-1501 dataset.
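The abstract names two mechanisms without giving their details: an ensemble of localization heads combined by voting, and a reasoning step over attribute and global features (the keywords mention a graph convolutional network). The NumPy sketch below is only an illustration of how such pieces might fit together, not the authors' implementation. The function names, the majority-vote rule, the 0.5 threshold, the mask-weighted pooling, and the symmetric-normalized graph convolution are all assumptions introduced here.

```python
import numpy as np


def vote_attention(head_maps, threshold=0.5):
    """Combine several localization heads by majority vote (assumed rule).

    head_maps: array of shape (num_heads, h, w) with values in [0, 1].
    Returns an (h, w) soft mask: the mean of the head maps at locations
    where more than half of the heads reach `threshold`, zero elsewhere.
    """
    head_maps = np.asarray(head_maps, dtype=float)
    votes = (head_maps >= threshold).sum(axis=0)
    keep = votes * 2 > head_maps.shape[0]
    return np.where(keep, head_maps.mean(axis=0), 0.0)


def attribute_feature(feature_map, mask):
    """Mask-weighted average pooling (hypothetical helper).

    feature_map: (channels, h, w); mask: (h, w). Returns (channels,).
    """
    total = mask.sum()
    if total == 0:
        return np.zeros(feature_map.shape[0])
    return (feature_map * mask).sum(axis=(1, 2)) / total


def graph_reasoning_step(nodes, adjacency, weight):
    """One generic graph-convolution step over attribute/global nodes.

    Computes ReLU(D^-1/2 (A + I) D^-1/2 X W), a standard GCN layer used
    here as a stand-in for the reasoning module (an assumption).
    nodes: (n, d_in); adjacency: (n, n); weight: (d_in, d_out).
    """
    a_hat = adjacency + np.eye(adjacency.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ nodes @ weight, 0.0)
```

In this sketch, each attribute would get its own voted mask and pooled feature vector, and those vectors (plus a global appearance vector) would form the rows of `nodes` passed to `graph_reasoning_step`.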
Pages: 4376 - 4387
Number of pages: 12
Related papers
50 in total
  • [41] Hair Attribute Transfer via Deep Feature Fusion
    Xie Z.
    Su X.
    Liu S.
    Zhang G.
    Ma L.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2021, 33 (05): : 772 - 779
  • [42] Automatic Knowledge Discovery in Lecturing Videos via Deep Representation
    Lin, Jinjiao
    Liu, Chunfang
    Li, Yibin
    Cui, Lizhen
    Wang, Rui
    Lu, Xudong
    Zhang, Yan
    Lian, Jian
    IEEE ACCESS, 2019, 7 : 33957 - 33963
  • [43] Person re-identification through face detection from videos using Deep Learning
    Mathew, Vimala
    Toby, Tom
    Chacko, Anu
    Udhayakumar, A.
    13TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED NETWORKS AND TELECOMMUNICATION SYSTEMS (IEEE ANTS), 2019,
  • [44] Learning deep part-aware embedding for person retrieval
    Zhao, Yang
    Shen, Chunhua
    Yu, Xiaohan
    Chen, Hao
    Gao, Yongsheng
    Xiong, Shengwu
    PATTERN RECOGNITION, 2021, 116
  • [45] IMPROVING ATTRIBUTE-BASED PERSON RETRIEVAL BY USING A CALIBRATED, WEIGHTED, AND DISTRIBUTION-BASED DISTANCE METRIC
    Specker, Andreas
    Beyerer, Juergen
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2378 - 2382
  • [46] Deep and low-level feature based attribute learning for person re-identification
    Chen, Yiqiang
    Duffner, Stefan
    Stoian, Andrei
    Dufour, Jean-Yves
    Baskurt, Atilla
    IMAGE AND VISION COMPUTING, 2018, 79 : 25 - 34
  • [47] Social Context-aware Person Search in Videos via Multi-modal Cues
    Li, Dan
    Xu, Tong
    Zhou, Peilun
    He, Weidong
    Hao, Yanbin
    Zheng, Yi
    Chen, Enhong
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2022, 40 (03)
  • [48] V2AnomalyVec: Deep Discriminative Embeddings for Detecting Anomalous Activities in Surveillance Videos
    Chandrakala, S.
    Vignesh, L. K. P.
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2022, 9 (05): : 1307 - 1316
  • [49] Deep Self-Supervised Hashing With Fine-Grained Similarity Mining for Cross-Modal Retrieval
    Han, Lijun
    Wang, Renlin
    Chen, Chunlei
    Zhang, Huihui
    Zhang, Yujie
    Zhang, Wenfeng
    IEEE ACCESS, 2024, 12 : 31756 - 31770
  • [50] Energy-Based Periodicity Mining With Deep Features for Action Repetition Counting in Unconstrained Videos
    Yin, Jianqin
    Wu, Yanchun
    Zhu, Chaoran
    Yin, Zijin
    Liu, Huaping
    Dang, Yonghao
    Liu, Zhiyi
    Liu, Jun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (12) : 4812 - 4825