SDIGRU: Spatial and Deep Features Integration Using Multilayer Gated Recurrent Unit for Human Activity Recognition

Cited by: 22
Authors
Ahmad, Tariq [1 ]
Wu, Jinsong [2 ,3 ]
Affiliations
[1] Guilin Univ Elect Technol, Sch Informat & Commun Engn, Guilin 541004, Peoples R China
[2] Guilin Univ Elect Technol, Sch Artificial Intelligence, Guilin 541004, Peoples R China
[3] Univ Chile, Dept Elect Engn, Santiago 8370451, Chile
Keywords
Convolutional neural networks (CNNs); gated recurrent unit (GRU); human activity recognition; FUSION; NETWORK; VIDEOS; CNN
DOI
10.1109/TCSS.2023.3249152
CLC number
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
Smart video surveillance plays a significant role in public security by storing huge amounts of continual stream data, evaluating them, and generating warnings when undesirable human activities are performed. Recognizing human activities in video surveillance faces many challenges, such as optimally evaluating human activities under growing volumes of streaming data with complex computation and high processing-time demands. To tackle these challenges, we introduce a lightweight spatial and deep features integration using a multilayer GRU (SDIGRU). First, we extract spatial and deep features from the frame sequences of realistic human activity videos using a lightweight MobileNetV2 model, and then integrate those spatial and deep features. Although deep features can be used for human activity recognition, they contain only high-level appearance, which is insufficient to correctly identify a particular human activity. Thus, we jointly apply deep information with spatial appearance to produce detailed-level information. Furthermore, we select rich informative features from the spatial and deep appearances. Then, we train a multilayer gated recurrent unit (GRU) and feed it the informative features to learn the temporal dynamics of the human activity frame sequence at each GRU time step. We conduct our experiments on the benchmark YouTube11, HMDB51, and UCF101 human activity recognition datasets. The empirical results show that our method achieves significant recognition performance with low computational complexity and quick response. Finally, we compare the results with existing state-of-the-art techniques, which shows the effectiveness of our method.
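The pipeline the abstract describes (per-frame CNN features fed to a multilayer GRU that models temporal dynamics, with classification at the final time step) can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the tiny convolutional backbone stands in for the pretrained MobileNetV2 used in the paper, and the dimensions (`feat_dim`, `hidden`, `num_classes=51` for HMDB51) are assumptions chosen to keep the example self-contained.

```python
import torch
import torch.nn as nn

class SDIGRUSketch(nn.Module):
    """Hypothetical sketch of the SDIGRU idea: frame-level CNN features
    -> multilayer GRU over the frame sequence -> activity classes."""

    def __init__(self, feat_dim=128, hidden=256, num_layers=2, num_classes=51):
        super().__init__()
        # Stand-in backbone: the paper uses a lightweight pretrained
        # MobileNetV2; a tiny CNN keeps this sketch dependency-free.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Multilayer GRU learns temporal dynamics across frames.
        self.gru = nn.GRU(feat_dim, hidden, num_layers=num_layers,
                          batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clips):
        # clips: (batch, time, 3, H, W)
        b, t = clips.shape[:2]
        # Extract per-frame features, then restore the time dimension.
        feats = self.backbone(clips.flatten(0, 1)).view(b, t, -1)
        out, _ = self.gru(feats)      # hidden state at every time step
        return self.head(out[:, -1])  # classify from the last time step

model = SDIGRUSketch()
logits = model(torch.randn(2, 8, 3, 64, 64))  # 2 clips of 8 frames each
print(logits.shape)  # torch.Size([2, 51])
```

In the paper, the integrated spatial and deep features would replace the raw backbone output before the GRU; the sketch only shows the CNN-to-GRU data flow.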
Pages: 973-985
Page count: 13