3DFCNN: real-time action recognition using 3D deep neural networks with raw depth information

Cited by: 37
Authors
Sanchez-Caballero, Adrian [1 ]
de Lopez-Diz, Sergio [1 ]
Fuentes-Jimenez, David [1 ]
Losada-Gutierrez, Cristina [1 ]
Marron-Romera, Marta [1 ]
Casillas-Perez, David [2 ]
Sarker, Mohammad Ibrahim [1 ]
Affiliations
[1] Univ Alcala, Dept Elect, Km 33600, Alcala De Henares 28805, Spain
[2] Univ Rey Juan Carlos, Dept Signal Proc & Commun, Madrid, Spain
Keywords
3D-CNN; Action Recognition; Depth Maps; Real-time; Video-surveillance; RGB-D;
DOI
10.1007/s11042-022-12091-z
CLC Classification Code
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
This work describes an end-to-end approach for real-time human action recognition from raw depth image sequences. The proposal is based on a 3D fully convolutional neural network, named 3DFCNN, which automatically encodes spatio-temporal patterns from raw depth sequences. The described 3D-CNN allows action classification from the spatially and temporally encoded information of depth sequences. The use of depth data ensures that action recognition is carried out while protecting people's privacy, since their identities cannot be recognized from these data. The proposed 3DFCNN has been optimized to reach good accuracy while working in real time. It has then been evaluated and compared with other state-of-the-art systems on three widely used public datasets with different characteristics, demonstrating that 3DFCNN outperforms all the non-DNN-based state-of-the-art methods with a maximum accuracy of 83.6% and obtains results comparable to the DNN-based approaches, while maintaining a much lower computational cost of 1.09 seconds, which significantly increases its applicability in real-world environments.
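The core operation behind the 3DFCNN described above is the 3D convolution, which slides a spatio-temporal kernel over a stack of depth frames so that each output value mixes information across both space and time. The following is a minimal NumPy sketch of that operation on a toy depth clip; it is illustrative only and does not reproduce the authors' actual 3DFCNN architecture (the array sizes, single filter, and `conv3d_valid` helper are all assumptions for the example).

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D convolution (cross-correlation, no flipping)
    over a single-channel depth clip of shape (T, H, W)."""
    kt, kh, kw = kernel.shape
    T, H, W = clip.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                # Each output value aggregates a small spatio-temporal
                # neighborhood: kt frames x (kh x kw) pixels.
                out[t, i, j] = np.sum(clip[t:t + kt, i:i + kh, j:j + kw] * kernel)
    return out

# Toy depth clip: 8 frames of 16x16 depth maps (hypothetical sizes)
clip = np.random.rand(8, 16, 16)
# One 3x3x3 spatio-temporal filter
kernel = np.random.rand(3, 3, 3)
features = conv3d_valid(clip, kernel)
print(features.shape)  # (6, 14, 14): temporal and spatial extents shrink by kernel-1
```

In a real network such a filter bank would be learned, applied with many output channels, and stacked with nonlinearities and pooling; deep-learning frameworks provide optimized equivalents of this loop.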
Pages: 24119-24143
Page count: 25
Related papers
99 references in total
  • [1] Human Action Recognition based on 3D Convolution Neural Networks from RGBD Videos
    Al-Akam, Rawya
    Paulus, Dietrich
    Gharabaghi, Darius
    [J]. 26. INTERNATIONAL CONFERENCE IN CENTRAL EUROPE ON COMPUTER GRAPHICS, VISUALIZATION AND COMPUTER VISION (WSCG 2018), 2018, 2803 : 18 - 26
  • [2] [Anonymous], 2015, PROC CVPR IEEE
  • [3] View invariant action recognition using projective depth
    Ashraf, Nazim
    Sun, Chuan
    Foroosh, Hassan
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2014, 123 : 41 - 52
  • [4] Baptista-Rios M., 2016, 2016 INT C IND POS I, P1, DOI 10.1109/IPIN.2016.7743617
  • [5] A survey of video datasets for human action and activity recognition
    Chaquet, Jose M.
    Carmona, Enrique J.
    Fernandez-Caballero, Antonio
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2013, 117 (06) : 633 - 659
  • [6] A survey of depth and inertial sensor fusion for human action recognition
    Chen, Chen
    Jafari, Roozbeh
    Kehtarnavaz, Nasser
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (03) : 4405 - 4425
  • [7] Real-time human action recognition based on depth motion maps
    Chen, Chen
    Liu, Kui
    Kehtarnavaz, Nasser
    [J]. JOURNAL OF REAL-TIME IMAGE PROCESSING, 2016, 12 (01) : 155 - 163
  • [8] Cheng ZW, 2012, LECT NOTES COMPUT SC, V7584, P52, DOI 10.1007/978-3-642-33868-7_6
  • [9] Robust Feature-Based Automated Multi-View Human Action Recognition System
    Chou, Kuang-Pen
    Prasad, Mukesh
    Wu, Di
    Sharma, Nabin
    Li, Dong-Lin
    Lin, Yu-Feng
    Blumenstein, Michael
    Lin, Wen-Chieh
    Lin, Chin-Teng
    [J]. IEEE ACCESS, 2018, 6 : 15283 - 15296
  • [10] Das S, 2019, NEW HYBRID ARCHITECT