The Progress of Human Action Recognition in Videos Based on Deep Learning: A Review

Authors
Luo H.-L. [1]
Tong K. [1]
Kong F.-S. [2]
Affiliations
[1] School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, 341000, Jiangxi
[2] School of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, Zhejiang
Source
Tien Tzu Hsueh Pao/Acta Electronica Sinica | 2019, Vol. 47, No. 5
Keywords
Action recognition; Convolutional neural network; Deep learning; Review
DOI
10.3969/j.issn.0372-2112.2019.05.025
Abstract
Human action recognition in videos is a challenging topic in the field of computer vision. It is widely used not only in video information retrieval, daily-life security, and public video surveillance, but also in human-computer interaction, scientific cognition, and other fields. First, the research background, significance, and difficulties of action recognition are briefly introduced. Then, action recognition methods based on deep learning models are comprehensively reviewed from three aspects: the types and numbers of input signals, the combination with traditional feature extraction methods, and the datasets used for pre-training. Furthermore, the performance of some typical methods on the UCF101 and HMDB51 datasets is summarized and analyzed. Finally, possible future research directions are discussed from three perspectives: video data preprocessing, the feature representation of human motion in video, and model training. © 2019, Chinese Institute of Electronics. All rights reserved.
Pages: 1162-1173
Number of pages: 11