Multi-view region-adaptive multi-temporal DMM and RGB action recognition

Cited by: 9
Authors
Al-Faris, Mahmoud [1 ]
Chiverton, John P. [1 ]
Yang, Yanyan [2 ]
Ndzi, David L. [3 ]
Affiliations
[1] Univ Portsmouth, Sch Energy & Elect Engn, Portsmouth PO1 3DJ, Hants, England
[2] Univ Portsmouth, Sch Comp, Portsmouth PO1 3HE, Hants, England
[3] Univ West Scotland, Sch Comp Engn & Phys Sci, Paisley PA1 2BE, Renfrew, Scotland
Keywords
Action recognition; DMM; 3D CNN; Region adaptive; ENSEMBLE;
DOI
10.1007/s10044-020-00886-5
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human action recognition remains an important yet challenging task. This work proposes an action recognition system based on a novel multi-view region-adaptive multi-resolution-in-time depth motion map (MV-RAMDMM) formulation combined with appearance information. Multi-stream 3D convolutional neural networks (CNNs) are trained on the different views and time resolutions of the region-adaptive depth motion maps. Multiple views are synthesised to enhance view invariance. The region-adaptive weights, based on localised motion, accentuate and differentiate the parts of actions with faster motion. Dedicated 3D CNN streams for multi-time-resolution appearance information are also included; these help to identify and differentiate between small object interactions. A pre-trained 3D CNN is fine-tuned for each stream and combined with multi-class support vector machines, with average score fusion applied to the outputs. The developed approach is capable of recognising both human actions and human-object interactions. Three public-domain datasets, namely MSR 3D Action, Northwestern-UCLA Multiview Action and MSR 3D Daily Activity, are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms.
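The depth motion map at the core of the MV-RAMDMM descriptor accumulates absolute differences between consecutive depth frames. A minimal front-view sketch is given below; the function name and the threshold `eps` are illustrative, not taken from the paper, and the full method additionally projects onto multiple synthesised views, applies region-adaptive motion weights and uses several temporal resolutions.

```python
import numpy as np

def depth_motion_map(depth_seq, eps=1.0):
    """Accumulate thresholded absolute differences between consecutive
    depth frames (a basic front-view DMM).

    depth_seq: array of shape (T, H, W) holding T depth frames.
    eps: motion threshold; pixels whose inter-frame depth change exceeds
         eps are counted as moving (value chosen here for illustration).
    """
    frames = np.asarray(depth_seq, dtype=np.float32)
    diffs = np.abs(np.diff(frames, axis=0))          # shape (T-1, H, W)
    # Each pixel's DMM value counts how many frame transitions moved it.
    return (diffs > eps).astype(np.float32).sum(axis=0)
```

The resulting H x W map can then be fed (per view and per temporal resolution) to a 3D CNN stream, as described in the abstract.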
Pages: 1587-1602
Page count: 16
References (71 items)
[1] Al-Faris M., 2017, IET 3 INT C INT SIGN, P1
[2] Al-Faris, Mahmoud; Chiverton, John; Yang, Yanyan; Ndzi, David. Deep Learning of Fuzzy Weighted Multi-Resolution Depth Motion Maps with Spatial Feature Fusion for Action Recognition [J]. JOURNAL OF IMAGING, 2019, 5 (10)
[3] Andre Chaaraoui, Alexandros; Ramon Padilla-Lopez, Jose; Climent-Perez, Pau; Florez-Revuelta, Francisco. Evolutionary joint selection to improve human action recognition with RGB-D devices [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (03): 786-794
[4] [Anonymous], 2014, ASIAN C COMPUTER VIS
[5] [Anonymous], 2018, FUSION APPEARANCE BA
[6] [Anonymous], 2017, Int J Adv Intell Informatics, DOI 10.26555/ijain.v3i1.89
[7] [Anonymous], 2015, PROC 4 INT C ELECT P
[8] Asadi-Aghbolaghi, Maryam; Bertiche, Hugo; Roig, Vicent; Kasaei, Shohreh; Escalera, Sergio. Action Recognition from RGB-D Data: Comparison and fusion of spatio-temporal handcrafted features and deep strategies [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017: 3179-3188
[9] Baccouche Moez, 2011, Human Behavior Understanding. Proceedings Second International Workshop, HBU 2011, P29, DOI 10.1007/978-3-642-25446-8_4
[10] Baptista R, 2019, INT CONF ACOUST SPEE, P2542, DOI 10.1109/ICASSP.2019.8682904