Modeling Temporal Visual Salience for Human Action Recognition Enabled Visual Anonymity Preservation

Cited by: 4
Authors
Al-Obaidi, Salah [1]
Al-Khafaji, Hiba [1]
Abhayaratne, Charith [1]
Affiliations
[1] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England
Keywords
Visual anonymization; human action recognition; histogram of gradients in salience (HOG-S); temporal visual salience estimation; privacy; video-based monitoring; assisted living; HISTOGRAMS; PRIVACY; ROBUST; LSTM;
DOI
10.1109/ACCESS.2020.3039740
CLC Classification
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
This paper proposes a novel approach for visually anonymizing video clips while retaining the ability to perform machine-based analysis of them, such as human action recognition. The visual anonymization is achieved by a new method that generates the anonymization silhouette by modeling the frame-wise temporal visual salience. These temporal salience-based silhouettes are then analysed by extracting the proposed histograms of gradients in salience (HOG-S) to learn the action representation in the visually anonymized domain. Since the anonymization maps are based on the temporal salience maps represented in gray scale, only the moving body parts involved in the action are assigned larger gray values, forming highly anonymized silhouettes. This results in the highest mean anonymity score (MAS), the fewest identifiable visual appearance attributes, and high human-perceived utility for action recognition. For machine-based human action recognition, the proposed HOG-S features achieve the highest accuracy in the anonymized domain compared with existing anonymization methods. Overall, the proposed holistic human action recognition method, i.e., temporal salience modeling followed by HOG-S feature extraction, yields the best human action recognition accuracy rates on the DHA, KTH, UIUC1, UCF Sports and HMDB51 datasets, with improvements of 3%, 1.6%, 0.8%, 1.3% and 16.7%, respectively. The proposed method outperforms both feature-based and deep-learning-based existing approaches.
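The pipeline described in the abstract (temporal salience map, anonymized silhouette, then a gradient histogram over the salience values) can be sketched as below. This is a minimal illustration, not the paper's implementation: plain frame differencing stands in for the paper's temporal salience model, and the HOG-S descriptor is computed as a single global orientation histogram rather than the cell-based layout a full HOG would use. All function names here are hypothetical.

```python
import numpy as np

def temporal_salience(frames):
    """Crude stand-in for the paper's temporal salience model:
    absolute frame differencing, normalized to [0, 1]. Moving
    regions receive larger gray values; static background stays
    near zero, which is what anonymizes appearance."""
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))
    peak = diffs.max()
    return diffs / peak if peak > 0 else diffs

def hog_s(salience_map, n_bins=9):
    """Histogram of gradients over a gray-scale salience map
    (a simplified, global version of the HOG-S idea): unsigned
    orientations in [0, 180) degrees, weighted by gradient
    magnitude, L1-normalized."""
    gy, gx = np.gradient(salience_map)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, 180.0), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Toy clip: a bright block moving one pixel right per frame.
frames = np.zeros((3, 16, 16), dtype=np.uint8)
for t in range(3):
    frames[t, 4:8, 4 + t:8 + t] = 255

sal = temporal_salience(frames)   # salience maps, shape (2, 16, 16)
desc = hog_s(sal[0])              # 9-bin descriptor for the first map
```

Only motion boundaries survive in `sal`, so the descriptor encodes the action's motion pattern without exposing appearance, which is the intuition behind recognizing actions in the anonymized domain.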
Pages: 213806-213824
Page count: 19