An attention-based RGBD dual-branch gesture recognition network

Citations: 0
Authors
Chen, Bo [1, 2]
Xie, Pengwei [1, 2]
Hao, Nan [1, 2]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Beijing 100081, Peoples R China
[2] Key Lab Complex Syst Intelligent Control & Decis, Beijing 100081, Peoples R China
Source
2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC) | 2021
Keywords
Gesture recognition; RGBD feature fusion; attention mechanism; real-time
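DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract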
In this work, we use a hierarchical detector-classifier architecture for the gesture recognition task. While the architecture is running, the detector, which essentially acts as a switch for the classifier, is always active. When the detector's output is true, the classifier is activated and returns a classification label for the input video stream. Our work focuses on improving both the detector and the classifier. In the detector, we introduce an attention mechanism that guides the network to focus on the spatial locations and channels where the gesture is located. For the classifier, in addition to the RGB stream, we use an independent branch to extract features from the depth stream, and finally merge the two branches. Because gestures move in three-dimensional space, depth information can compensate for what RGB information lacks. Experiments show that on the EgoGesture test set, our detector achieves 98.86% accuracy on RGB input, while the classifier achieves 93.85% accuracy. At the same time, our gesture recognition architecture fully meets real-time requirements.
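The detector-as-switch control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `run_pipeline`, `toy_detector`, `toy_classifier`, the sliding-window size, and the one-number "frames" are all hypothetical stand-ins chosen only to show that the detector runs on every frame while the classifier runs only when the detector fires.

```python
from typing import Callable, List, Optional, Sequence

Frame = Sequence[float]  # hypothetical stand-in for one video frame


def run_pipeline(
    frames: Sequence[Frame],
    detector: Callable[[List[Frame]], bool],
    classifier: Callable[[List[Frame]], str],
    window: int = 4,
) -> List[Optional[str]]:
    """Return one entry per frame: a label when the detector fires, else None."""
    buffer: List[Frame] = []
    outputs: List[Optional[str]] = []
    for frame in frames:
        # Keep a sliding window of the most recent frames.
        buffer.append(frame)
        if len(buffer) > window:
            buffer.pop(0)
        # The detector is evaluated on every frame (always running).
        if detector(buffer):
            # The classifier is only invoked when the detector says "gesture".
            outputs.append(classifier(list(buffer)))
        else:
            outputs.append(None)
    return outputs


classifier_calls = 0  # counts how often the (expensive) classifier runs


def toy_detector(clip: List[Frame]) -> bool:
    # Assumption for illustration: high mean intensity in the newest frame
    # means a gesture is present.
    newest = clip[-1]
    return sum(newest) / len(newest) > 0.5


def toy_classifier(clip: List[Frame]) -> str:
    # Assumption for illustration: compare newest and oldest frame in the window.
    global classifier_calls
    classifier_calls += 1
    return "moving" if clip[-1][0] > clip[0][0] else "static"


frames = [[0.0], [0.1], [0.9], [1.0], [0.2]]
labels = run_pipeline(frames, toy_detector, toy_classifier)
print(labels)
```

In this toy run the classifier is called for only two of the five frames, which is the point of the gating design: the lightweight detector carries the always-on cost, and the heavier classifier is paid for only when a gesture is likely present.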
Pages: 8022-8027
Page count: 6