POSNet: a hybrid deep learning model for efficient person re-identification

Cited by: 3

Authors:

Batool, Eliza ^{[1]}

Gillani, Saira ^{[2]}

Naz, Sheneela ^{[3]}

Bukhari, Maryam ^{[1]}

Maqsood, Muazzam ^{[1]}

Yeo, Sang-Soo ^{[4]}

Rho, Seungmin ^{[5]}

Affiliations:

[1] COMSATS Univ Islamabad, Dept Comp Sci, Attock Campus, Attock, Pakistan

[2] Univ Cent Punjab, Fac Informat Technol & Comp Sci, Lahore, Pakistan

[3] COMSATS Univ Islamabad, Dept Comp Sci, Islamabad, Pakistan

[4] Mokwon Univ, Dept Comp Engn, Daejeon 35349, South Korea

[5] Chung Ang Univ, Dept Ind Secur, Seoul 06974, South Korea

Source:

JOURNAL OF SUPERCOMPUTING | 2023 / Vol. 79 / Issue 12

Funding:

National Research Foundation of Singapore;

Keywords:

Person re-identification; POSNet; Hybrid deep learning; Spatio-temporal feature learning; Limited labeled data challenges; Intra- and inter-class variations; Soft-pool-assisted attentions; LSTM; RECOGNITION; NETWORK;

DOI:

10.1007/s11227-023-05169-4

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Person re-identification refers to the process of recognizing a person across several non-overlapping cameras. It is becoming increasingly important in computer vision for real-world surveillance applications. However, deploying person re-identification systems for surveillance raises various performance challenges. These challenges include limited labeled data, occlusions, variations in human body posture, and inter- and intra-class variations. Such challenges degrade the effectiveness of person re-identification systems and lead to the extraction of less discriminative features. To address these problems, we propose a hybrid deep learning model, POSNet (pseudo-labeled omni-scale network), for efficient person re-identification. The method is called hybrid because it combines label estimation with modified omni-scale feature learning, i.e., spatiotemporal-assisted omni-scale feature extraction. To further enhance omni-scale feature learning, we propose soft-pool-assisted attention mechanisms during spatial learning. More precisely, soft-pool preserves the more important features, and those features are further emphasized by spatial and channel attention layers. This omni-scale learning with soft-pool attention extracts spatial information from all video frames, after which temporal learning is incorporated using an LSTM model. To handle the limited labeled data problem, the hybrid model first assigns pseudo-labels to the unlabeled data and then adopts a progressive learning strategy to retrain the model on both labeled and unlabeled data with improved feature extraction, i.e., modified omni-scale feature learning. The proposed POSNet model is validated on two large video-based person re-identification datasets, MARS and DukeMTMC-Video. The results show that POSNet outperforms existing studies, with highest mAP and rank@1 scores of 83.7% and 90.3%, respectively.
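
The abstract states that soft-pool preserves the more important features before the attention layers. As a rough illustration only, the sketch below implements generic soft pooling (each window is reduced to a softmax-weighted sum of its own activations) on a single 2D feature map with NumPy. The function name `soft_pool_2d`, the window size `k`, and the non-overlapping windows are assumptions made for this example; the abstract does not specify POSNet's actual layer configuration.

```python
import numpy as np


def soft_pool_2d(x, k=2):
    """Generic soft pooling over non-overlapping k x k windows of an (H, W) map.

    Hypothetical helper for illustration; not taken from the POSNet paper.
    """
    h, w = x.shape
    if h % k != 0 or w % k != 0:
        raise ValueError("H and W must be divisible by k")
    # Rearrange the map into (H/k, W/k, k*k) so the last axis holds one window.
    windows = (
        x.reshape(h // k, k, w // k, k)
        .transpose(0, 2, 1, 3)
        .reshape(h // k, w // k, k * k)
    )
    # Softmax weights within each window (shifted by the max for stability).
    shifted = windows - windows.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is the softmax-weighted sum of the window's activations.
    return (weights * windows).sum(axis=-1)


feature_map = np.arange(16, dtype=float).reshape(4, 4)
pooled = soft_pool_2d(feature_map, k=2)
```

Each pooled value lies between the window's mean and its maximum, so strong activations dominate while weaker ones still contribute, which is consistent with the abstract's description of preserving the more important features.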

Citation:

Pages: 13090-13118

Page count: 29

96 references in total

