POSNet: a hybrid deep learning model for efficient person re-identification

Cited by: 3
Authors
Batool, Eliza [1]
Gillani, Saira [2]
Naz, Sheneela [3]
Bukhari, Maryam [1]
Maqsood, Muazzam [1]
Yeo, Sang-Soo [4]
Rho, Seungmin [5]
Affiliations
[1] COMSATS Univ Islamabad, Dept Comp Sci, Attock Campus, Attock, Pakistan
[2] Univ Cent Punjab, Fac Informat Technol & Comp Sci, Lahore, Pakistan
[3] COMSATS Univ Islamabad, Dept Comp Sci, Islamabad, Pakistan
[4] Mokwon Univ, Dept Comp Engn, Daejeon 35349, South Korea
[5] Chung Ang Univ, Dept Ind Secur, Seoul 06974, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Person re-identification; POSNet; Hybrid deep learning; Spatio-temporal feature learning; Limited labeled data challenges; Intra- and inter-class variations; Soft-pool-assisted attentions; LSTM; RECOGNITION; NETWORK;
DOI
10.1007/s11227-023-05169-4
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Person re-identification is the task of recognizing a person across several non-overlapping cameras. It is becoming increasingly important in computer vision for real-world surveillance applications. However, deploying person re-identification systems for surveillance raises several performance challenges, including limited labeled data, occlusion, variations in human body posture, and inter- and intra-class variations. These challenges degrade the effectiveness of person re-identification systems and lead to the extraction of less discriminative features. To address these problems, we propose a hybrid deep learning model, namely POSNet (pseudo-labeled omni-scale network), for efficient person re-identification. The method is called hybrid because it combines label estimation with modified omni-scale feature learning, i.e., spatiotemporal-assisted omni-scale feature extraction, to accomplish person re-identification. To further enhance omni-scale feature learning, we propose soft-pool-assisted attention mechanisms during spatial learning. More precisely, soft-pool preserves the more important features, and those features are further emphasized by spatial and channel attention layers. This omni-scale learning with soft-pool attention extracts spatial information from all frames of a video, after which temporal learning is incorporated using an LSTM model. To handle the limited labeled data problem, the proposed hybrid model first assigns pseudo-labels to the unlabeled data and then adopts a progressive learning strategy to retrain the model on both labeled and unlabeled data with improved feature extraction, i.e., modified omni-scale feature learning. The proposed POSNet model is validated on two large video-based person re-identification datasets, namely MARS and DukeMTMC-Video. The results show that POSNet outperforms existing studies, with the highest mAP and rank-1 scores of 83.7% and 90.3%, respectively.
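The abstract states that soft-pool preserves the more important features before attention is applied. As a rough illustration only, the sketch below shows the commonly used softmax-weighted pooling definition on a 1-D list of activations. The function name `soft_pool_1d`, the window size, and the stride are assumptions made for this example; the abstract does not give the exact formulation or hyperparameters used in POSNet.

```python
import math


def soft_pool_1d(values, kernel_size=2, stride=2):
    """Softmax-weighted pooling over a 1-D list of activations.

    Hypothetical helper for illustration; not taken from the paper.
    Each output is sum(softmax(window) * window), so larger activations
    dominate the result without discarding the smaller ones entirely.
    """
    pooled = []
    for start in range(0, len(values) - kernel_size + 1, stride):
        window = values[start:start + kernel_size]
        peak = max(window)  # subtract the max for numerical stability
        weights = [math.exp(v - peak) for v in window]
        total = sum(weights)
        pooled.append(sum(w * v for w, v in zip(weights, window)) / total)
    return pooled


activations = [0.0, 0.0, 1.0, 3.0]
result = soft_pool_1d(activations)
print(result)
```

For the window `[1.0, 3.0]` the output lies between the average (2.0) and the maximum (3.0), which is the sense in which soft pooling keeps strong activations while still letting weaker ones contribute.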
Pages: 13090-13118
Number of pages: 29