Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor

Cited by: 94
Authors
Mueller, Franziska [1, 2]
Mehta, Dushyant [1, 2]
Sotnychenko, Oleksandr [1]
Sridhar, Srinath [1]
Casas, Dan [3]
Theobalt, Christian [1]
Affiliations
[1] Max Planck Inst Informat, Saarbrucken, Germany
[2] Saarland Univ, Saarbrucken, Germany
[3] Univ Rey Juan Carlos, Mostoles, Spain
Source
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2017
Keywords
DOI
10.1109/ICCV.2017.131
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an approach for real-time, robust and accurate hand pose estimation from moving egocentric RGB-D cameras in cluttered real environments. Existing methods typically fail for hand-object interactions in cluttered scenes imaged from egocentric viewpoints, which are common in virtual or augmented reality applications. Our approach uses two Convolutional Neural Networks (CNNs), applied in sequence, to localize the hand and regress 3D joint locations. Hand localization is achieved by using a CNN to estimate the 2D position of the hand center in the input, even in the presence of clutter and occlusions. The localized hand position, together with the corresponding input depth value, is used to generate a normalized cropped image that is fed into a second CNN to regress relative 3D hand joint locations in real time. For added accuracy, robustness and temporal stability, we refine the pose estimates using a kinematic pose tracking energy. To train the CNNs, we introduce a new photorealistic dataset that uses a merged reality approach to capture and synthesize large amounts of annotated data of natural hand interaction in cluttered scenes. Through quantitative and qualitative evaluation, we show that our method is robust to self-occlusion and occlusions by objects, particularly in moving egocentric perspectives.
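The abstract states that the 2D hand center and its depth value are used to generate a normalized cropped image, but this record gives no further detail. The following is a minimal sketch of one way such a crop could be computed, assuming a pinhole camera model; the function names, the `hand_size_mm` and `focal_px` parameters, the nearest-neighbour resampling, and the depth normalization are illustrative assumptions, not the authors' implementation.

```python
def crop_box(center_uv, depth_mm, focal_px, hand_size_mm=250.0):
    """Return (left, top, side) of a square pixel crop around the hand center.

    Assumption: under a pinhole model, an object of physical size S at
    depth Z spans roughly focal_px * S / Z pixels, so the crop shrinks
    as the hand moves away from the camera.
    """
    if depth_mm <= 0:
        raise ValueError("depth must be positive")
    side = max(1, round(focal_px * hand_size_mm / depth_mm))
    u, v = center_uv
    return round(u - side / 2), round(v - side / 2), side


def normalized_crop(depth_image, center_uv, focal_px,
                    hand_size_mm=250.0, out_size=4):
    """Crop a depth image (list of rows, in mm) around the hand center.

    The crop is resampled to out_size x out_size by nearest neighbour,
    and depth is expressed relative to the center depth, clamped to
    [-1, 1]. Missing or out-of-bounds pixels are treated as far
    background (1.0). All of these choices are hypothetical.
    """
    u, v = center_uv
    center_depth = depth_image[v][u]
    left, top, side = crop_box(center_uv, center_depth, focal_px, hand_size_mm)
    height, width = len(depth_image), len(depth_image[0])
    half_range = hand_size_mm / 2
    crop = []
    for row in range(out_size):
        y = top + (row * side) // out_size
        out_row = []
        for col in range(out_size):
            x = left + (col * side) // out_size
            inside = 0 <= y < height and 0 <= x < width
            if inside and depth_image[y][x] > 0:
                rel = (depth_image[y][x] - center_depth) / half_range
                out_row.append(max(-1.0, min(1.0, rel)))
            else:
                out_row.append(1.0)
        crop.append(out_row)
    return crop
```

In this sketch the crop size depends only on the depth at the estimated center, so the hand occupies roughly the same fraction of the cropped image regardless of its distance from the camera.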
Pages: 1163-1172
Page count: 10
Related papers
42 in total
[11]  
Keskin C, 2011, 2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS), DOI 10.1109/ICCVW.2011.6130391
[12]   Scalable 3D Tracking of Multiple Interacting Objects [J].
Kyriazis, Nikolaos ;
Argyros, Antonis .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :3430-3437
[13]  
Oberweger M., 2016, IEEE C COMP VIS PATT
[14]   Training a Feedback Loop for Hand Pose Estimation [J].
Oberweger, Markus ;
Wohlhart, Paul ;
Lepetit, Vincent .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :3316-3324
[15]  
Oikonomidis I, 2011, IEEE I CONF COMP VIS, P2088, DOI 10.1109/ICCV.2011.6126483
[16]  
Panteleris Paschalis, 2015, BMVC
[17]   Realtime and Robust Hand Tracking from Depth [J].
Qian, Chen ;
Sun, Xiao ;
Wei, Yichen ;
Tang, Xiaoou ;
Sun, Jian .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :1106-1113
[18]   EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras [J].
Rhodin, Helge ;
Shafiei, Mohammad ;
Richardt, Christian ;
Seidel, Hans-Peter ;
Casas, Dan ;
Schiele, Bernt ;
Insafutdinov, Eldar ;
Theobalt, Christian .
ACM TRANSACTIONS ON GRAPHICS, 2016, 35 (06)
[19]  
Rogez G, 2015, PROC CVPR IEEE, P4325, DOI 10.1109/CVPR.2015.7299061
[20]  
Rogez Gregory, 2014, Computer Vision-ECCV 2014 Workshops, P356