Multistage Spatial Attention-Based Neural Network for Hand Gesture Recognition

Cited by: 29
Authors
Miah, Abu Saleh Musa [1]
Hasan, Md. Al Mehedi [1]
Shin, Jungpil [1]
Okuyama, Yuichi [1]
Tomioka, Yoichi [1]
Affiliation
[1] Univ Aizu, Sch Comp Sci & Engn, Fukushima 9658580, Japan
Keywords
kinematic sensor; multistage attention neural network; CNN; attention model; sign language recognition; gesture recognition;
DOI
10.3390/computers12010013
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
The definition of human-computer interaction (HCI) has changed in recent years as people have become interested in a variety of ergonomic devices. Many researchers have worked to develop hand gesture recognition systems using kinetic sensor-based datasets, but their accuracy is not satisfactory. In this work, we propose a multistage spatial attention-based neural network for hand gesture recognition to overcome these challenges. The proposed model has three stages, each built on a CNN. In the first stage, we apply a feature extractor and a spatial attention module based on self-attention to the original dataset, then multiply the feature vector by the attention map to highlight the effective features of the dataset. We then concatenate these features with the original dataset to obtain a modality feature embedding. In the second stage, we generate a feature vector and an attention map in the same way, using the feature extraction architecture and the self-attention technique. Multiplying the attention map and the features produces the final feature, which feeds into the third stage, a classification module that predicts the label of the corresponding hand gesture. Our model achieved 99.67%, 99.75%, and 99.46% accuracy on the senz3D, Kinematic, and NTU datasets, respectively.
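The data flow described in the abstract (extract features, compute a self-attention map, multiply, concatenate with the input, repeat) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the ReLU `toy_extractor` stands in for the CNN feature extractor, the sigmoid gating and scaled dot-product similarity are assumptions, and every function name here is hypothetical. The third-stage classifier is omitted.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of similarity scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention_map(features):
    # features: N position vectors, each of length C.
    # Returns an N x C map with values in (0, 1) (assumed sigmoid gating).
    n, c = len(features), len(features[0])
    scale = math.sqrt(c)
    weights = [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in features])
        for q in features
    ]
    context = [
        [sum(weights[i][j] * features[j][ch] for j in range(n)) for ch in range(c)]
        for i in range(n)
    ]
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in context]

def toy_extractor(x):
    # Hypothetical stand-in for the CNN feature extractor: element-wise ReLU.
    return [max(0.0, v) for v in x]

def attention_stage(inputs, extractor):
    # Extract features, then multiply them element-wise by the attention map.
    features = [extractor(x) for x in inputs]
    attn = self_attention_map(features)
    return [[f * a for f, a in zip(fr, ar)] for fr, ar in zip(features, attn)]

# Three spatial positions with two channels each (made-up values).
inputs = [[0.5, -1.0], [1.0, 2.0], [0.0, 0.3]]

# Stage 1: attended features, concatenated with the original input.
stage1 = attention_stage(inputs, toy_extractor)
embedding = [h + x for h, x in zip(stage1, inputs)]

# Stage 2: same procedure applied to the embedding; the result would
# feed the stage-3 classifier, which is not sketched here.
final = attention_stage(embedding, toy_extractor)
```

In this sketch the attention map acts as a soft gate, so each attended feature is never larger in magnitude than the extracted feature it scales.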
Pages: 11