Action recognition method based on a novel keyframe extraction method and enhanced 3D convolutional neural network

Cited by: 1

Authors:

Tian, Qiuhong [1]

Li, Saiwei [1]

Zhang, Yuankui [1]

Lu, Hongyi [1]

Pan, Hao [1]

Affiliations:

[1] Zhejiang Sci Tech Univ, Hangzhou 310018, Zhejiang, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS | 2025 / Vol. 16 / No. 01

Funding:

National Natural Science Foundation of China;

Keywords:

Action recognition; 3D attention mechanism; Keyframe extraction; 3D residual structure;

DOI:

10.1007/s13042-024-02235-y

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Action recognition remains a challenging task in computer vision. Traditional action recognition methods cannot fully extract the spatiotemporal features of actions in video. To address this problem, an action recognition method based on keyframe extraction and DAMR_3DNet (D3DNet + 3D Attention Mechanism module + 3D Residual module) is proposed. First, we explore a keyframe extraction method based on image information entropy and the hog_ssim similarity algorithm, which selects keyframes from the input video to represent the video content. The selected keyframes serve as the model input, which reduces the computational complexity of the network model. Next, we design the DAMR_3DNet model to recognize actions while reducing the number of network parameters. The D3DNet module improves the C3D network by replacing 3D convolution with 3D decoupled convolution and introducing a feature fusion layer. A 3D attention mechanism is designed to strengthen the action features and reduce the influence of background features. Finally, a 3D residual structure is explored to avoid vanishing gradients while fusing high-level and low-level spatiotemporal features. Experiments on the UCF101, Chinese sign language (CSL), and HMDB51 datasets consistently show the superiority of the proposed method. The results demonstrate that the proposed method is effective: it improves action recognition performance and outperforms most state-of-the-art methods.
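The abstract names image information entropy as one of the two cues used for keyframe extraction but does not give the selection rule. The following is a minimal Python sketch of the entropy cue only, under stated assumptions: frames are 2D lists of integer grayscale intensities, and keyframes are simply the highest-entropy frames. The function names `frame_entropy` and `select_keyframes_by_entropy` are hypothetical, the ranking rule is an illustration rather than the authors' procedure, and the hog_ssim similarity step is not modeled here.

```python
import math
from collections import Counter


def frame_entropy(gray_frame):
    """Shannon entropy (in bits) of one frame's intensity histogram.

    gray_frame: 2D list of integer pixel intensities (assumed input format).
    """
    pixels = [p for row in gray_frame for p in row]
    counts = Counter(pixels)  # histogram: intensity value -> pixel count
    total = len(pixels)
    # H = -sum(p_i * log2(p_i)) over the intensities that actually occur
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_keyframes_by_entropy(frames, num_keyframes):
    """Indices of the num_keyframes highest-entropy frames, in temporal order.

    Assumption: ranking by entropy alone; the paper's method also uses a
    hog_ssim similarity measure that this sketch leaves out.
    """
    scores = [frame_entropy(f) for f in frames]
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:num_keyframes])


# Toy frames: uniform, two intensity levels, four intensity levels.
flat = [[0, 0], [0, 0]]
two_level = [[0, 255], [0, 255]]
four_level = [[0, 1], [2, 3]]
keyframe_indices = select_keyframes_by_entropy([flat, two_level, four_level], 2)
```

A uniform frame carries no information under this measure, so it is ranked last. A practical pipeline would likely also compare neighboring frames to avoid selecting near-duplicates, which is presumably the role of the similarity measure mentioned in the abstract.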


Pages: 475 - 491

Number of pages: 17

50 records in total

[41] Action Recognition Model Based on 3D Graph Convolution and Attention Enhanced
Cao Yi
Liu Chen
Sheng Yongjian
Huang Zilong
Deng Xiaolong
JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (07) : 2071 - 2078
[42] KS-FuseNet: An Efficient Action Recognition Method Based on Keyframe Selection and Feature Fusion
Mao, Keming
Xiao, Yilong
Jing, Xin
Hu, Zepeng
Ping, Yi
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VII, 2025, 15037 : 540 - 553
[43] DDC3N: Doppler-Driven Convolutional 3D Network for Human Action Recognition
Toshpulatov, Mukhiddin
Lee, Wookey
Lee, Suan
Yoon, Hoyoung
Kang, U. Kang
IEEE ACCESS, 2024, 12 : 93546 - 93567
[44] Recognition method of basketball players’ shooting action based on graph convolution neural network
Xu J.
International Journal of Reasoning-based Intelligent Systems, 2022, 14 (04) : 227 - 232
[45] Trajectory-Pooled 3D Convolutional Descriptors for Action Recognition
Lu, Xiusheng
Yao, Hongxun
Sun, Xiaoshuai
Zhang, Shengping
Zhang, Yanhao
ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT I, 2018, 10735 : 247 - 257
[46] Spatial-temporal pyramid based Convolutional Neural Network for action recognition
Zheng, Zhenxing
An, Gaoyun
Wu, Dapeng
Ruan, Qiuqi
NEUROCOMPUTING, 2019, 358 : 446 - 455
[47] Temporal Pyramid Pooling-Based Convolutional Neural Network for Action Recognition
Wang, Peng
Cao, Yuanzhouhan
Shen, Chunhua
Liu, Lingqiao
Shen, Heng Tao
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017, 27 (12) : 2613 - 2622
[48] Attention-Based Temporal Weighted Convolutional Neural Network for Action Recognition
Zang, Jinliang
Wang, Le
Liu, Ziyi
Zhang, Qilin
Niu, Zhenxing
Hua, Gang
Zheng, Nanning
ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2018, 2018, 519 : 97 - 108
[49] Multi-cue based 3D residual network for action recognition
Zong, Ming
Wang, Ruili
Chen, Zhe
Wang, Maoli
Wang, Xun
Potgieter, Johan
Neural Computing and Applications, 2021, 33 : 5167 - 5181
[50] Multi-cue based 3D residual network for action recognition
Zong, Ming
Wang, Ruili
Chen, Zhe
Wang, Maoli
Wang, Xun
Potgieter, Johan
NEURAL COMPUTING & APPLICATIONS, 2021, 33 (10) : 5167 - 5181
