Action Recognition from RGB-D Data: Comparison and fusion of spatio-temporal handcrafted features and deep strategies

Cited by: 16

Authors:

Asadi-Aghbolaghi, Maryam ^{[1]}

Bertiche, Hugo ^{[2]}

Roig, Vicent ^{[2]}

Kasaei, Shohreh ^{[1]}

Escalera, Sergio ^{[2,3]}

Affiliations:

[1] Sharif Univ Tech, Dept Comp Engn, Tehran, Iran

[2] Univ Barcelona, Barcelona, Spain

[3] Comp Vis Ctr, Barcelona, Spain

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017) | 2017

Keywords:

DESCRIPTORS; SEQUENCES;

DOI:

10.1109/ICCVW.2017.376

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this work, multimodal fusion of RGB-D data is analyzed for action recognition by using scene flow as early fusion and integrating the results of all modalities in a late fusion fashion. Recently, there has been a migration from traditional handcrafted features to deep learning. However, handcrafted features are still widely used owing to their high performance and low computational complexity. In this research, multimodal dense trajectories (MMDT) are proposed to describe RGB-D videos. Dense trajectories are pruned based on scene flow data. In addition, 2DCNN is extended to a multimodal version (MM2DCNN) by adding one more stream (scene flow) as input and then fusing the output of all models. We evaluate and compare the results from each modality and their fusion on two action datasets. The experimental results show that the new representation improves the accuracy. Furthermore, the fusion of handcrafted and learning-based features boosts the final performance, achieving state-of-the-art results.

Citation

Pages: 3179-3188

Number of pages: 10

References: 46 in total
