Human Action Recognition Based on Transfer Learning Approach

Cited by: 34
Authors
Abdulazeem, Yousry [1]
Balaha, Hossam Magdy [2]
Bahgat, Waleed M. [3]
Badawy, Mahmoud [2]
Affiliations
[1] Misr Higher Inst Engn & Technol, Comp Engn Dept, Mansoura 35516, Egypt
[2] Mansoura Univ, Fac Engn, Comp & Syst Engn Dept, Mansoura 35511, Egypt
[3] Mansoura Univ, Fac Comp & Informat Sci, Informat Technol Dept, Mansoura 35511, Egypt
Keywords
Transfer learning; Feature extraction; Three-dimensional displays; Solid modeling; Deep learning; Computer architecture; Training; Convolutional neural network (CNN); human action recognition (HAR); long short-term memory (LSTM); spatiotemporal info; transfer learning (TL); DROPOUT; LSTM
DOI
10.1109/ACCESS.2021.3086668
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Human action recognition techniques have gained significant attention among next-generation technologies owing to their distinctive features and strong ability to inspect video sequences to understand human actions. As a result, many fields have benefited from them. Deep learning has played a primary role in many approaches to human action recognition, and transfer learning is now extending its reach. Accordingly, this study proposes a framework for human action recognition with three main phases: pre-training, preprocessing, and recognition. The framework's contributions are three-fold: (i) in the pre-training phase, a standard convolutional neural network is trained on a generic dataset to adjust its weights; (ii) the pre-trained model is then applied to the target dataset to perform recognition; and (iii) the recognition phase uses convolutional neural networks and long short-term memory to build five different architectures. Three architectures are stand-alone and single-stream, while the other two combine the first three in a two-stream style. Experimental results show that the first three architectures achieved accuracies of 83.24%, 90.72%, and 90.85%, respectively, and the last two achieved 93.48% and 94.87%, respectively. Moreover, the recorded results outperform other state-of-the-art models in the same field.
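As a rough illustration of the two-stream idea mentioned in the abstract, the sketch below combines the class scores of two single-stream models by late fusion. The abstract does not state how the streams are actually combined, so the weighted averaging of softmax probabilities, the function names (`softmax`, `fuse_two_streams`, `predict`), the `weight_a` parameter, and the example labels and scores are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_streams(scores_a, scores_b, weight_a=0.5):
    """Late fusion (assumed): weighted average of each stream's probabilities."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both streams must score the same set of classes")
    probs_a = softmax(scores_a)
    probs_b = softmax(scores_b)
    return [
        weight_a * pa + (1.0 - weight_a) * pb
        for pa, pb in zip(probs_a, probs_b)
    ]


def predict(fused_probs, labels):
    """Return the label with the highest fused probability."""
    best = max(range(len(fused_probs)), key=fused_probs.__getitem__)
    return labels[best]


# Hypothetical scores from two single-stream models for one video clip.
labels = ["walk", "run", "jump"]
stream_a_scores = [2.0, 1.0, 0.1]
stream_b_scores = [0.5, 2.5, 0.3]

fused = fuse_two_streams(stream_a_scores, stream_b_scores)
print(predict(fused, labels))
```

With equal weights, the more confident stream dominates the fused decision; setting `weight_a` to 1.0 or 0.0 reduces the sketch to a single-stream prediction.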
Pages: 82058-82069
Number of pages: 12