Video supervised for 3D reconstruction from single image

Cited by: 1
Authors
Zhong, Yijie [1]
Sun, Zhengxing [1]
Luo, Shoutong [1]
Sun, Yunhan [1]
Wang, Yi [1]
Affiliations
[1] Nanjing University, State Key Laboratory for Novel Software Technology, Nanjing, People's Republic of China
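Funding
National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords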
Single image reconstruction; 3D reconstruction; Video supervision; Knowledge distillation; SHAPE;
DOI
10.1007/s11042-022-12459-1
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As a long-standing ill-posed problem, 3D reconstruction from a single image is an important research topic in computer vision. The information in a single image is consistent with an infinite number of possible three-dimensional shapes, so recovering reasonable object geometry requires a correct shape prior. The key issues are therefore what kind of supervision to use and how to make better use of the training data. In this paper, we propose a framework for 3D reconstruction from a single image with video supervision. On the one hand, we build a temporal network that generates fine 3D structure from video input by exploiting its temporal correlation. On the other hand, we introduce knowledge distillation to transfer the shape prior extracted from the video. This mechanism ensures that the student network, which performs single-image reconstruction, can make full use of the knowledge learned by the teacher network, which receives video input. In the inference phase, the student network can be used independently. Extensive experiments on ShapeNet show the superiority of our method.
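The teacher-student transfer described in the abstract can be illustrated with a generic distillation loss. The sketch below is not the authors' formulation, which the abstract does not state. It assumes that both networks output voxel occupancy probabilities, that the student is trained against both the ground-truth grid and the teacher's soft prediction, and that a weight `alpha` balances the two terms. The names `distillation_loss`, `student_occ`, `teacher_occ`, `gt_occ`, and `alpha` are hypothetical.

```python
import numpy as np


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted and target occupancy probabilities."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))))


def distillation_loss(student_occ, teacher_occ, gt_occ, alpha=0.5):
    """Hypothetical combined loss (an assumption, not the paper's definition).

    alpha weights the ground-truth term; (1 - alpha) weights the term that
    pulls the single-image student towards the video teacher's soft output.
    """
    hard_term = binary_cross_entropy(student_occ, gt_occ)
    soft_term = binary_cross_entropy(student_occ, teacher_occ)
    return alpha * hard_term + (1.0 - alpha) * soft_term


# Toy 4x4x4 occupancy grids standing in for real network outputs.
rng = np.random.default_rng(0)
gt = (rng.random((4, 4, 4)) > 0.5).astype(np.float64)
teacher = gt * 0.9 + 0.05        # teacher: confident and close to ground truth
student_good = gt * 0.8 + 0.1    # student that roughly agrees with the teacher
student_bad = 1.0 - student_good # student that predicts the opposite occupancy

loss_good = distillation_loss(student_good, teacher, gt)
loss_bad = distillation_loss(student_bad, teacher, gt)
print(loss_good < loss_bad)
```

At inference time only the student would be evaluated, which matches the abstract's statement that the student network can be used independently of the teacher.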
Pages: 15061-15083
Number of pages: 23