Dynamic Gesture Recognition Based on Two-Scale 3-D-ConvNeXt

Times cited: 1
Authors
Hao, Sida [1]
Fu, Min [1]
Liu, Xuefeng [2]
Zheng, Bing [1]
Affiliations
[1] Ocean Univ China, Sanya Oceanog Inst, Sanya 572024, Peoples R China
[2] Qingdao Univ Sci & Technol, Coll Automat & Elect Engn, Qingdao 266100, Peoples R China
Keywords
ConvNeXt; gesture recognition; human-computer interaction; spatiotemporal information
DOI
10.1109/JSEN.2023.3324479
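Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract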
Gesture recognition is a direct means of human-machine interaction and is vital in many practical applications. Effectively extracting spatiotemporal information from video nevertheless remains a fundamental problem, which an accurate and efficient network design can address. This work builds on ConvNeXt, chosen for its strong still-image processing capability. The network is extended to 3-D to handle dynamic data, and a two-scale convolution kernel is introduced to focus on the hand region, yielding a novel two-scale 3-D-ConvNeXt network (TS3C-Net). Mixup and CutMix data augmentation and label smoothing regularization are also applied to further improve performance. In experiments, the proposed TS3C-Net achieves accuracies of 95.36%, 97.1%, and 87.55% on the EgoGesture, Jester, and NVGesture datasets, respectively.
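The abstract names Mixup and label smoothing without giving settings. As a rough illustration only, the sketch below shows how the two techniques are commonly combined when building a training target. It is not the authors' implementation: the function names, the smoothing factor `epsilon=0.1`, the Beta parameter `alpha=0.2`, and the use of flattened clips as plain lists are all assumptions made for this example.

```python
import random


def smooth_labels(class_index, num_classes, epsilon=0.1):
    """Return a one-hot target softened by label smoothing.

    epsilon is a hypothetical default; the abstract does not state a value.
    """
    off_value = epsilon / num_classes
    target = [off_value] * num_classes
    target[class_index] += 1.0 - epsilon
    return target


def sample_lambda(alpha=0.2, rng=random):
    """Draw the Mixup blending weight from Beta(alpha, alpha) (alpha assumed)."""
    return rng.betavariate(alpha, alpha)


def mixup(clip_a, target_a, clip_b, target_b, lam):
    """Blend two clips (flattened to lists of floats) and their targets."""
    mixed_clip = [lam * a + (1.0 - lam) * b for a, b in zip(clip_a, clip_b)]
    mixed_target = [lam * a + (1.0 - lam) * b
                    for a, b in zip(target_a, target_b)]
    return mixed_clip, mixed_target


if __name__ == "__main__":
    # Two toy "clips" of four values each, from classes 0 and 2 of 3 classes.
    clip_a, clip_b = [0.0, 0.2, 0.4, 0.6], [1.0, 0.8, 0.6, 0.4]
    target_a = smooth_labels(0, num_classes=3)
    target_b = smooth_labels(2, num_classes=3)
    lam = sample_lambda()
    mixed_clip, mixed_target = mixup(clip_a, target_a, clip_b, target_b, lam)
    print(lam, mixed_clip, mixed_target)
```

Because both inputs to `mixup` are already smoothed distributions that sum to 1, the blended target also sums to 1 and can be used directly with a soft-target cross-entropy loss. CutMix follows the same target-mixing rule but pastes a region of one clip into the other instead of blending every value.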
Pages: 29227-29234
Page count: 8