POSNet: a hybrid deep learning model for efficient person re-identification

Cited by: 3
Authors
Batool, Eliza [1]
Gillani, Saira [2]
Naz, Sheneela [3]
Bukhari, Maryam [1]
Maqsood, Muazzam [1]
Yeo, Sang-Soo [4]
Rho, Seungmin [5]
Affiliations
[1] COMSATS Univ Islamabad, Dept Comp Sci, Attock Campus, Attock, Pakistan
[2] Univ Cent Punjab, Fac Informat Technol & Comp Sci, Lahore, Pakistan
[3] COMSATS Univ Islamabad, Dept Comp Sci, Islamabad, Pakistan
[4] Mokwon Univ, Dept Comp Engn, Daejeon 35349, South Korea
[5] Chung Ang Univ, Dept Ind Secur, Seoul 06974, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Person re-identification; POSNet; Hybrid deep learning; Spatio-temporal feature learning; Limited labeled data challenges; Intra- and inter-class variations; Soft-pool-assisted attentions; LSTM; RECOGNITION; NETWORK;
DOI
10.1007/s11227-023-05169-4
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Person re-identification is the task of recognizing a person across several non-overlapping cameras, and it is becoming increasingly important in computer vision for real-world surveillance applications. Deploying person re-identification systems for surveillance, however, raises several challenges: limited labeled data, occlusions, varying human body postures, and inter- and intra-class variations. These challenges degrade system effectiveness and lead to the extraction of less discriminative features. To address them, we propose a hybrid deep learning model, POSNet (pseudo-labeled omni-scale network), for efficient person re-identification. The method is hybrid because it combines label estimation with modified omni-scale feature learning, i.e., spatiotemporal-assisted omni-scale feature extraction. To further enhance omni-scale feature learning, we propose soft-pool-assisted attention mechanisms for spatial learning: soft-pooling preserves the more important features, which are then further emphasized by spatial and channel attention layers. Omni-scale learning with soft-pool attention extracts spatial information from all frames of a video, after which temporal learning is performed with an LSTM model. To handle limited labeled data, the hybrid model first assigns pseudo-labels to the unlabeled data and then adopts a progressive learning strategy that retrains the model on both labeled and unlabeled data using the improved feature extraction, i.e., modified omni-scale feature learning. POSNet is validated on two large video-based person re-identification datasets, MARS and DukeMTMC-Video. The results show that POSNet outperforms existing studies, with a highest mAP of 83.7% and a highest rank@1 score of 90.3%.
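The soft-pooling step mentioned in the abstract can be illustrated with a minimal sketch. This record does not give POSNet's exact formulation, so the code below assumes the commonly used SoftPool definition, in which each pooling window is reduced to a softmax-weighted sum of its own activations. The function name `soft_pool2d`, the single-channel 2D input, and the non-overlapping `k x k` windows are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def soft_pool2d(x, k=2):
    """Softmax-weighted pooling over non-overlapping k x k windows.

    Hypothetical helper for illustration only; x is a 2D array of
    shape (H, W) with H and W divisible by k.
    """
    h, w = x.shape
    # Rearrange into windows: shape (H/k, W/k, k*k).
    windows = (
        x.reshape(h // k, k, w // k, k)
        .transpose(0, 2, 1, 3)
        .reshape(h // k, w // k, k * k)
    )
    # Softmax weights per window (shifted by the max for numerical stability).
    shifted = windows - windows.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=-1, keepdims=True)
    # Larger activations receive larger weights, so each output lies
    # between the window's average and its maximum.
    return (weights * windows).sum(axis=-1)

feature_map = np.arange(16, dtype=float).reshape(4, 4)
pooled = soft_pool2d(feature_map, k=2)
print(pooled.shape)
```

Compared with max pooling, which keeps one activation per window, and average pooling, which weights all activations equally, this weighting keeps a contribution from every activation while favoring the strongest ones, which matches the abstract's claim that soft-pooling preserves the more important features.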
Pages: 13090-13118
Page count: 29