Efficient lightweight video person re-identification with online difference discrimination module

Cited by: 3

Authors:

Gao, Cunyuan [1]

Yao, Rui [1,2]

Zhou, Yong [1]

Zhao, Jiaqi [1]

Fang, Liang [1]

Hu, Fuyuan [2]

Affiliations:

[1] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China

[2] Suzhou Univ Sci & Technol, Suzhou Smart City Res Inst, Suzhou 215009, Peoples R China

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2022, Vol. 81, Issue 14

Funding:

National Natural Science Foundation of China;

Keywords:

Video person re-identification; Spatial attention; Video temporal modeling;

DOI:

10.1007/s11042-021-10543-6

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Video person re-identification (video Re-ID) is a key technology for video surveillance and security. Typical person re-identification retrieves the correct match for a target image (the query) from a set of gallery images, while video Re-ID extends this to retrieving matches from gallery videos. Two main factors determine the performance of a video Re-ID model: (i) a high-quality frame-level feature extractor, and (ii) temporal modeling that combines frame-level features into a single feature for retrieval. In this work, we use a lightweight algorithm based on ShuffleNet V2 for video Re-ID, which meets the demands of practical applications by reducing the consumption of computing resources while maintaining high performance. In addition, we use the lightweight spatial attention module Spatial Group-wise Enhance (SGE) to capture finer details of the person, which makes the feature representation more compact and improves retrieval accuracy. Finally, we design an Online Difference Discrimination (ODD) module that measures the feature gap between video frames, and we use it to apply different temporal modeling to video sequences of different quality. Experiments on three datasets (iLIDS-VID, PRID2011, and MARS) show that our method is competitive with state-of-the-art methods.
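The abstract does not say how the ODD module measures the feature gap or which temporal models it chooses between, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes the gap is the mean Euclidean distance between consecutive frame-level features, that stable sequences are average-pooled, and that unstable sequences use a pooling that down-weights frames far from the sequence centroid. All function names and the `gap_threshold` value are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_frame_gap(frames):
    """Assumed gap measure: mean distance between consecutive frame features."""
    if len(frames) < 2:
        return 0.0
    gaps = [l2_distance(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]
    return sum(gaps) / len(gaps)


def average_pool(frames):
    """Plain temporal average pooling over frame-level features."""
    if not frames:
        raise ValueError("at least one frame feature is required")
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def centroid_weighted_pool(frames):
    """Weighted pooling that down-weights frames far from the centroid.

    Weights are a softmax over negative distances to the centroid, so an
    outlier frame (for example, an occluded one) contributes less.
    """
    centroid = average_pool(frames)
    scores = [math.exp(-l2_distance(f, centroid)) for f in frames]
    total = sum(scores)
    weights = [s / total for s in scores]
    return [sum(w * x for w, x in zip(weights, column)) for column in zip(*frames)]


def aggregate(frames, gap_threshold=0.5):
    """Pick a temporal model based on the measured frame-to-frame gap.

    gap_threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    if mean_frame_gap(frames) <= gap_threshold:
        return average_pool(frames)
    return centroid_weighted_pool(frames)


if __name__ == "__main__":
    stable = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    noisy = [[1.0, 0.0], [1.0, 0.0], [5.0, 4.0]]
    print("stable sequence feature:", aggregate(stable))
    print("noisy sequence feature:", aggregate(noisy))
```

In this sketch, a sequence whose frames agree is pooled uniformly, while a sequence containing an outlier frame yields a feature pulled less strongly toward that outlier than a plain average would be.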

Pages: 19169-19181

Number of pages: 13

