3DFCNN: real-time action recognition using 3D deep neural networks with raw depth information

Cited by: 37
Authors
Sanchez-Caballero, Adrian [1 ]
de Lopez-Diz, Sergio [1 ]
Fuentes-Jimenez, David [1 ]
Losada-Gutierrez, Cristina [1 ]
Marron-Romera, Marta [1 ]
Casillas-Perez, David [2 ]
Sarker, Mohammad Ibrahim [1 ]
Affiliations
[1] Univ Alcala, Dept Elect, Km 33600, Alcala De Henares 28805, Spain
[2] Univ Rey Juan Carlos, Dept Signal Proc & Commun, Madrid, Spain
Keywords
3D-CNN; Action Recognition; Depth Maps; Real-time; Video-surveillance; RGB-D;
DOI
10.1007/s11042-022-12091-z
CLC Classification Code
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
This work describes an end-to-end approach for real-time human action recognition from raw depth image sequences. The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from raw depth sequences. The described 3D-CNN allows action classification from the spatially and temporally encoded information of depth sequences. The use of depth data ensures that action recognition is carried out while protecting people's privacy, since their identities cannot be recognized from these data. The proposed 3DFCNN has been optimized to reach good accuracy while working in real time. It has then been evaluated and compared with other state-of-the-art systems on three widely used public datasets with different characteristics, demonstrating that 3DFCNN outperforms all the non-DNN-based state-of-the-art methods with a maximum accuracy of 83.6% and obtains results comparable to the DNN-based approaches, while maintaining a much lower computational cost of 1.09 seconds, which significantly increases its applicability in real-world environments.
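The core operation behind the 3DFCNN described above is the 3D convolution, which slides a spatio-temporal kernel over a stack of depth frames so that each output value mixes information across both space and time. The following is a minimal NumPy sketch of that operation on a toy depth clip; it is illustrative only and does not reproduce the authors' actual 3DFCNN architecture (the array sizes, single filter, and `conv3d_valid` helper are all assumptions for the example).

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D convolution (cross-correlation, no flipping)
    over a single-channel depth clip of shape (T, H, W)."""
    kt, kh, kw = kernel.shape
    T, H, W = clip.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                # Each output value aggregates a small spatio-temporal
                # neighborhood: kt frames x (kh x kw) pixels.
                out[t, i, j] = np.sum(clip[t:t + kt, i:i + kh, j:j + kw] * kernel)
    return out

# Toy depth clip: 8 frames of 16x16 depth maps (hypothetical sizes)
clip = np.random.rand(8, 16, 16)
# One 3x3x3 spatio-temporal filter
kernel = np.random.rand(3, 3, 3)
features = conv3d_valid(clip, kernel)
print(features.shape)  # (6, 14, 14): temporal and spatial extents shrink by kernel-1
```

In a real network such a filter bank would be learned, applied with many output channels, and stacked with nonlinearities and pooling; deep-learning frameworks provide optimized equivalents of this loop.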
Pages: 24119-24143
Page count: 25
Related papers
99 references in total
  • [1] Human Action Recognition based on 3D Convolution Neural Networks from RGBD Videos
    Al-Akam, Rawya
    Paulus, Dietrich
    Gharabaghi, Darius
    [J]. 26. INTERNATIONAL CONFERENCE IN CENTRAL EUROPE ON COMPUTER GRAPHICS, VISUALIZATION AND COMPUTER VISION (WSCG 2018), 2018, 2803 : 18 - 26
  • [2] [Anonymous], 2015, PROC CVPR IEEE
  • [3] View invariant action recognition using projective depth
    Ashraf, Nazim
    Sun, Chuan
    Foroosh, Hassan
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2014, 123 : 41 - 52
  • [4] Baptista-Rios M., 2016, 2016 INT C IND POS I, P1, DOI 10.1109/IPIN.2016.7743617
  • [5] A survey of video datasets for human action and activity recognition
    Chaquet, Jose M.
    Carmona, Enrique J.
    Fernandez-Caballero, Antonio
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2013, 117 (06) : 633 - 659
  • [6] A survey of depth and inertial sensor fusion for human action recognition
    Chen, Chen
    Jafari, Roozbeh
    Kehtarnavaz, Nasser
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (03) : 4405 - 4425
  • [7] Real-time human action recognition based on depth motion maps
    Chen, Chen
    Liu, Kui
    Kehtarnavaz, Nasser
    [J]. JOURNAL OF REAL-TIME IMAGE PROCESSING, 2016, 12 (01) : 155 - 163
  • [8] Cheng ZW, 2012, LECT NOTES COMPUT SC, V7584, P52, DOI 10.1007/978-3-642-33868-7_6
  • [9] Robust Feature-Based Automated Multi-View Human Action Recognition System
    Chou, Kuang-Pen
    Prasad, Mukesh
    Wu, Di
    Sharma, Nabin
    Li, Dong-Lin
    Lin, Yu-Feng
    Blumenstein, Michael
    Lin, Wen-Chieh
    Lin, Chin-Teng
    [J]. IEEE ACCESS, 2018, 6 : 15283 - 15296
  • [10] Das S, 2019, NEW HYBRID ARCHITECT