Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks

Citations: 54
Authors
Reddy, N. Dinesh [1]
Vo, Minh [1]
Narasimhan, Srinivasa G. [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Keywords
REPRESENTATION; MODEL;
DOI
10.1109/CVPR.2019.00750
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present Occlusion-Net, a framework to predict 2D and 3D locations of occluded keypoints for objects, in a largely self-supervised manner. We use an off-the-shelf detector as input (e.g., MaskRCNN [16]) that is trained only on visible keypoint annotations. This is the only supervision used in this work. A graph encoder network then explicitly classifies invisible edges and a graph decoder network corrects the occluded keypoint locations from the initial detector. Central to this work is a trifocal tensor loss that provides indirect self-supervision for occluded keypoint locations that are visible in other views of the object. The 2D keypoints are then passed into a 3D graph network that estimates the 3D shape and camera pose using the self-supervised reprojection loss. At test time, Occlusion-Net successfully localizes keypoints in a single view under a diverse set of occlusion settings. We validate our approach on synthetic CAD data as well as a large image set capturing vehicles at many busy city intersections. As an interesting aside, we compare the accuracy of human labels of invisible keypoints against those predicted by the trifocal tensor.
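The abstract names a reprojection loss but does not give its form. As a rough illustration only, the sketch below shows one common way such a loss is written: project estimated 3D keypoints through a pinhole camera and average the squared pixel distance to the 2D keypoints. The pinhole model, the function names `project` and `reprojection_loss`, and the parameters `K`, `R`, `t` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def project(points_3d, K, R, t):
    """Project (N, 3) world points to (N, 2) pixels with an assumed pinhole camera."""
    cam = points_3d @ R.T + t            # world -> camera coordinates
    uvw = cam @ K.T                      # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]      # perspective divide


def reprojection_loss(points_3d, keypoints_2d, K, R, t):
    """Mean squared pixel distance between projected 3D points and 2D keypoints."""
    residual = project(points_3d, K, R, t) - keypoints_2d
    return float(np.mean(np.sum(residual ** 2, axis=1)))


if __name__ == "__main__":
    # Hypothetical camera and shape, chosen only to exercise the functions.
    K = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    R = np.eye(3)
    t = np.array([0.0, 0.0, 5.0])
    shape = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 1.0]])
    observed = project(shape, K, R, t)
    print(reprojection_loss(shape, observed, K, R, t))
```

When the 2D keypoints agree exactly with the projection, the loss is zero; it grows with the squared pixel error, which is what lets a loss of this kind supervise 3D shape and pose from 2D keypoints alone.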
Pages: 7318-7327
Page count: 10