Fast Surveillance Video Retrieval Model Based on Tolerant Training and Privacy Protection

Cited by: 0

Authors:

Qin H. [1,2]

Wang P.-H. [1,2]

Zhang R.-F. [1,2]

Qin Z.-Y. [3]

Affiliations:

[1] School of Cyber Science and Engineering, Xi’an Jiaotong University, Xi’an

[2] Ministry of Education Key Laboratory for Intelligent Networks and Network Security, Xi’an Jiaotong University, Xi’an

[3] School of Software Engineering, Xi’an Jiaotong University, Xi’an

Source:

Ruan Jian Xue Bao/Journal of Software | 2023 / Vol. 34 / No. 03

Keywords:

curriculum learning; knowledge distillation; privacy protection; video retrieval

DOI:

10.13328/j.cnki.jos.006790

Abstract:

Keyframe retrieval and attribute search over surveillance video have many applications in traffic, security, education, and other fields. Applying deep learning models to massive video data reduces manual effort to some extent, but it also risks privacy disclosure, consumes substantial computing resources, and takes a long time. To address these problems, this study proposes a secure and fast retrieval model for massive surveillance video. Because computing power is abundant in the cloud but limited in surveillance cameras, a heavyweight model is deployed in the cloud, and the proposed tolerant training strategy is used to perform customized knowledge distillation; the distilled lightweight model is then deployed inside the surveillance camera. At the same time, a local encryption algorithm encrypts the sensitive parts of each image; combined with cloud TEE (trusted execution environment) technology and a user authorization mechanism, this achieves privacy protection with very low resource consumption. By reasonably controlling the “tolerance” of the distillation strategy, the time spent in the camera video input stage and the cloud retrieval stage can be balanced, ensuring extremely low retrieval latency while maintaining extremely high accuracy. Compared with traditional retrieval methods, the proposed model is secure, efficient, scalable, and low-latency. Experimental results on multiple open datasets show that the proposed model achieves a 9× to 133× speedup over traditional retrieval methods. © 2023 Chinese Academy of Sciences. All rights reserved.

Citation

Pages: 1292-1309

Page count: 17
