Action recognition method based on a novel keyframe extraction method and enhanced 3D convolutional neural network

Cited by: 1

Authors:

Tian, Qiuhong [1]

Li, Saiwei [1]

Zhang, Yuankui [1]

Lu, Hongyi [1]

Pan, Hao [1]

Affiliation:

[1] Zhejiang Sci Tech Univ, Hangzhou 310018, Zhejiang, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS | 2025, Vol. 16, No. 01

Funding:

National Natural Science Foundation of China

Keywords:

Action recognition; 3D attention mechanism; Keyframe extraction; 3D residual structure;

DOI:

10.1007/s13042-024-02235-y

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Action recognition remains a challenging task in computer vision, and traditional methods cannot fully extract the spatiotemporal features of actions in video. To address this problem, we propose an action recognition method based on keyframe extraction and DAMR_3DNet (D3DNet + 3D Attention Mechanism module + 3D Residual module). First, we explore a keyframe extraction method based on image information entropy and the hog_ssim similarity algorithm, which selects keyframes from the input video to represent its content. The selected keyframes serve as the model input, reducing the computational complexity of the network. Next, we design the DAMR_3DNet model to recognize actions while reducing the number of network parameters. Its D3DNet module improves the C3D network by replacing 3D convolution with 3D decoupled convolution and introducing a feature fusion layer. A 3D attention mechanism strengthens action features and reduces the influence of background features. Finally, a 3D residual structure avoids vanishing gradients while fusing high-level and low-level spatiotemporal features. Experiments on the UCF101, Chinese Sign Language (CSL), and HMDB51 datasets show that the proposed method improves action recognition performance and outperforms most state-of-the-art methods.
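
The entropy half of the keyframe extraction step can be illustrated with a minimal sketch. It computes the Shannon entropy of each frame's gray-level histogram and keeps the highest-entropy frames. The abstract does not say how entropy is combined with the hog_ssim similarity score, so the top-k selection rule and the function names `image_entropy` and `select_keyframes` are assumptions made for illustration, not the authors' procedure.

```python
import math
from collections import Counter


def image_entropy(gray_frame):
    """Shannon entropy (in bits) of one frame's gray-level histogram.

    gray_frame is a 2D list of integer intensities, e.g. values in 0..255.
    """
    pixels = [p for row in gray_frame for p in row]
    total = len(pixels)
    counts = Counter(pixels)
    # H = sum over gray levels of -p * log2(p), with p the level's frequency.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def select_keyframes(frames, k):
    """Return the indices of the k highest-entropy frames in temporal order.

    Assumption: a plain top-k rule; the source method also uses hog_ssim
    similarity, which this sketch does not model.
    """
    ranked = sorted(
        range(len(frames)),
        key=lambda i: image_entropy(frames[i]),
        reverse=True,
    )
    return sorted(ranked[:k])


if __name__ == "__main__":
    flat = [[7, 7], [7, 7]]          # one gray level
    two_level = [[0, 255], [0, 255]]  # two equally likely gray levels
    four_level = [[0, 1], [2, 3]]     # four equally likely gray levels
    video = [flat, two_level, four_level, flat]
    print(select_keyframes(video, 2))
```

A uniform frame carries no information under this measure, so it is ranked last; frames with a wider spread of gray levels are ranked first.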


Pages: 475-491

Page count: 17

