Hand Pose Estimation in the Task of Egocentric Actions

Cited by: 1
Authors
Hruz, Marek [1, 2]
Kanis, Jakub [2 ]
Krnoul, Zdenek [1 ]
Affiliations
[1] Univ West Bohemia Pilsen, Fac Appl Sci, Dept Cybernet, Plzen 30614, Czech Republic
[2] Univ West Bohemia Pilsen, Fac Appl Sci, New Technol Informat Soc, Plzen 30100, Czech Republic
Keywords
Three-dimensional displays; Pose estimation; Task analysis; Two dimensional displays; Solid modeling; Prediction algorithms; Location awareness; 3D convolutional neural network; egocentric; hand pose; TruncatedSVD; volumetric data; REGRESSION
DOI
10.1109/ACCESS.2021.3050624
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this article, we tackle the problem of hand pose estimation when the hand is interacting with various objects from an egocentric viewpoint. This setting entails frequent occlusion of parts of the hand by the object, as well as self-occlusion of the hand. We use a Voxel-to-Voxel approach to obtain hypotheses of the hand joint locations, ensemble the hypotheses, and apply several post-processing strategies to improve the results. We utilize prior hand pose models in the form of Truncated Singular Value Decomposition (SVD), together with temporal context, to produce refined hand joint locations. We present an ablation study of the methods to show the influence of the individual post-processing features. With our method, we achieved state-of-the-art results on the HANDS19 Challenge: Task 2 - Depth-Based 3D Hand Pose Estimation while Interacting with Objects, with a precision of 33.09 mm on unseen test data.
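The idea of a Truncated SVD pose prior mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_pose_basis` and `refine_pose`, the flattened 63-dimensional pose layout (21 joints times 3 coordinates), and the choice to skip mean-centering are all assumptions made for illustration. The sketch learns a low-rank basis from training poses and projects a noisy joint estimate onto it.

```python
import numpy as np

def fit_pose_basis(train_poses, n_components):
    """Return the top right singular vectors of the pose matrix.

    train_poses: array of shape (N, D), one flattened pose per row
    (assumed layout: D = joints * 3). No mean-centering is applied here;
    that is an assumption of this sketch.
    """
    _, _, vt = np.linalg.svd(train_poses, full_matrices=False)
    return vt[:n_components]  # shape (n_components, D), orthonormal rows

def refine_pose(pose, basis):
    """Project a flattened pose onto the subspace spanned by the basis."""
    coeffs = basis @ pose      # low-dimensional pose coefficients
    return basis.T @ coeffs    # reconstruction in joint space

# Hypothetical usage with synthetic data lying in a 10-dimensional subspace.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(10, 63))
train = rng.normal(size=(200, 10)) @ mixing
basis = fit_pose_basis(train, n_components=10)

clean = rng.normal(size=10) @ mixing
noisy = clean + rng.normal(scale=0.5, size=63)
refined = refine_pose(noisy, basis)
```

In this toy setting the projection discards the part of the noise that lies outside the learned subspace, which is one plausible reading of how a pose prior can refine joint locations. The number of components would need to be tuned on real data.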
Pages: 10533-10547
Page count: 15