RGB-D Action Recognition: Recent Advances and Future Perspectives

Authors
Hu J.-F. [1,2,3]
Wang X.-H. [4]
Zheng W.-S. [1,2,3]
Lai J.-H. [1,2,3]
Affiliations
[1] School of Data and Computer Science, Sun Yat-sen University, Guangzhou
[2] Guangdong Province Key Laboratory of Computational Science, Guangzhou
[3] Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, Guangzhou
[4] School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou
Source
Zidonghua Xuebao/Acta Automatica Sinica | 2019, Vol. 45, Issue 5
Funding
National Natural Science Foundation of China
Keywords
Action recognition; Deep learning; RGB-D; Skeleton
DOI
10.16383/j.aas.c180436
Abstract
Action recognition is an important research topic in computer vision and is critical to real-world applications such as security monitoring, robot design, autonomous driving, and smart home systems. Existing single-modality, RGB-based action recognition approaches are susceptible to illumination variation and background clutter, which leads to inferior recognition performance. The emergence of low-cost RGB-D cameras opens a new dimension for addressing the action recognition problem: such cameras overcome the drawbacks of a single modality by outputting RGB, depth, and skeleton data, each of which describes actions from a different perspective. In this paper, we review recent advances in RGB-D action recognition. We first briefly introduce the datasets commonly used in RGB-D action recognition research, and then review the literature and state-of-the-art recognition models based on convolutional neural networks (CNN) and recurrent neural networks (RNN). Finally, we discuss the advantages and disadvantages of these methods through experiments on three datasets and outline open problems for future research. Copyright © 2019 Acta Automatica Sinica. All rights reserved.
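The abstract groups the surveyed deep models into CNN-based pipelines over RGB/depth frames and RNN-based pipelines over skeleton sequences, combined across modalities. The following PyTorch sketch is purely illustrative and is not a model from the paper: it shows one common way such a two-stream CNN+RNN network with late fusion can be wired up. The class name RGBDActionNet, the 4-channel RGB+depth input, the 25-joint skeleton, and all layer sizes are assumptions made for the example.

```python
# Illustrative sketch only (not the surveyed models): a two-stream network that
# pairs a small CNN over an RGB + depth frame with an LSTM over skeleton joints,
# then fuses both streams for action classification. All sizes are assumptions.
import torch
import torch.nn as nn

class RGBDActionNet(nn.Module):
    def __init__(self, num_classes=60, num_joints=25):
        super().__init__()
        # CNN stream: features from a 4-channel RGB + depth frame.
        self.cnn = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # RNN stream: temporal model over 3D joint coordinates (x, y, z per joint).
        self.rnn = nn.LSTM(input_size=num_joints * 3, hidden_size=128, batch_first=True)
        # Late fusion: concatenate both stream features, then classify.
        self.fc = nn.Linear(64 + 128, num_classes)

    def forward(self, rgbd_frame, skeleton_seq):
        visual = self.cnn(rgbd_frame)              # (B, 64)
        _, (h, _) = self.rnn(skeleton_seq)         # h: (num_layers, B, 128)
        fused = torch.cat([visual, h[-1]], dim=1)  # (B, 192)
        return self.fc(fused)

# Toy usage: a batch of 2 clips, one RGB-D frame each, 30 skeleton time steps.
model = RGBDActionNet()
logits = model(torch.randn(2, 4, 112, 112), torch.randn(2, 30, 75))
print(logits.shape)  # torch.Size([2, 60])
```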
Pages: 829-840 (11 pages)