Multi-Scale Feature-Based Spatiotemporal Pyramid Network for Hand Gesture Recognition

Cited by: 0
Authors
Cao, Zongjing [1]
Li, Yan [1]
Shin, Byeong-Seok [1]
Affiliations
[1] Inha Univ, Dept Elect & Comp Engn, Incheon, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Deep Learning; Hand Gesture Recognition; Pyramid Network; Spatiotemporal Feature;
DOI
10.22967/HCIS.2022.12.046
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Effectively capturing the spatiotemporal features of hand gestures from sequence data is crucial for gesture recognition. Existing work has effectively obtained motion features between neighboring frames through well-designed temporal modeling networks; however, less attention has been paid to the spatial information contained in each frame. These approaches ignore the implicit complementary advantages of multi-scale appearance representations, which are essential to gesture recognition. We propose a multi-scale, feature-based spatiotemporal pyramid network for hand gesture recognition. It has a top-down architecture with lateral connections, designed to fuse spatial and temporal features from multiple scales in each layer. The network first outputs a coarse feature in a feedforward pass and then refines this feature in the top-down pass using features from successively lower layers. Similar to skip connections, our approach uses features from each layer of the network, but it does not attempt to output independent predictions in each layer. Furthermore, we introduce a spatiotemporal pyramid module, formed by stacking multiple successive refinement modules, to fuse the multi-scale spatial features output from each layer. We evaluate the proposed model on two publicly available benchmark hand gesture datasets. The model achieved accuracies of 85.1% and 95.4% for the depth modality on the NVGesture and EgoGesture datasets, respectively. The comparison results show that the proposed hand gesture recognition method outperforms existing state-of-the-art methods.
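The coarse-to-fine refinement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how features are fused, so the element-wise addition, the nearest-neighbour 2x upsampling, and all names below (`upsample2x`, `add`, `top_down_refine`) are assumptions borrowed from generic feature-pyramid designs. The sketch also uses single-channel 2D maps and leaves out the temporal dimension.

```python
from typing import List

Feature = List[List[float]]  # one single-channel 2D feature map (H x W)


def upsample2x(x: Feature) -> Feature:
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out: Feature = []
    for row in x:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def add(a: Feature, b: Feature) -> Feature:
    """Element-wise sum of two maps of identical shape (assumed fusion op)."""
    assert len(a) == len(b) and len(a[0]) == len(b[0]), "shape mismatch"
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_refine(pyramid: List[Feature]) -> List[Feature]:
    """Refine a pyramid ordered from finest (index 0) to coarsest (index -1).

    Starts from the coarsest map and, moving down one level at a time,
    upsamples the current result and fuses it with that level's lateral map.
    Returns the refined maps in the same finest-to-coarsest order.
    """
    refined = [pyramid[-1]]
    for lateral in reversed(pyramid[:-1]):
        refined.append(add(upsample2x(refined[-1]), lateral))
    refined.reverse()
    return refined


# Hypothetical three-level pyramid: 4x4, 2x2 and 1x1 maps.
c1 = [[0.25] * 4 for _ in range(4)]
c2 = [[0.5] * 2 for _ in range(2)]
c3 = [[1.0]]
refined = top_down_refine([c1, c2, c3])
```

Only the finest refined map would feed a classifier in this sketch, which mirrors the abstract's point that the network uses features from every layer without producing an independent prediction at each one.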
Pages: 14