Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction

Cited by: 97
Authors
Tulsiani, Shubham [1]
Efros, Alexei A. [1]
Malik, Jitendra [1]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
Source
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2018
Funding
U.S. National Science Foundation
Keywords
DOI
10.1109/CVPR.2018.00306
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a framework for learning single-view shape and pose prediction without using direct supervision for either. Our approach allows leveraging multi-view observations from unknown poses as supervisory signal during training. Our proposed training setup enforces geometric consistency between the independently predicted shape and pose from two views of the same instance. We consequently learn to predict shape in an emergent canonical (view-agnostic) frame along with a corresponding pose predictor. We show empirical and qualitative results using the ShapeNet dataset and observe encouragingly competitive performance to previous techniques which rely on stronger forms of supervision. We also demonstrate the applicability of our framework in a realistic setting which is beyond the scope of existing techniques: using a training dataset comprised of online product images where the underlying shape and pose are unknown.
Pages: 2897-2905
Page count: 9