A General Differentiable Mesh Renderer for Image-Based 3D Reasoning

Cited by: 9

Authors:

Liu, Shichen [1,2]

Li, Tianye [1,2]

Chen, Weikai [3]

Li, Hao [4]

Affiliations:

[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90007 USA

[2] USC Inst Creat Technol, Los Angeles, CA 90094 USA

[3] Tencent Amer, Los Angeles, CA 94306 USA

[4] Pinscreen, Los Angeles, CA 90025 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2022 / Vol. 44 / No. 01

Keywords:

Vision and scene understanding; modeling and recovery of physical attributes; perceptual reasoning; computer graphics; picture/image generation; APPEARANCE;

DOI:

10.1109/TPAMI.2020.3007759

CLC number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Rendering bridges the gap between 2D vision and 3D scenes by simulating the physical process of image formation. By inverting such a renderer, one can devise a learning approach that infers 3D information from 2D images. However, standard graphics renderers involve a fundamental step called rasterization, which prevents rendering from being differentiable. Unlike the state-of-the-art differentiable renderers (Kato et al. 2018 and Loper 2018), which only approximate the rendering gradient in backpropagation, we propose a naturally differentiable rendering framework that is able to (1) directly render a colorized mesh using differentiable functions and (2) back-propagate efficient supervision signals to mesh vertices and their attributes from various forms of image representations. The key to our framework is a novel formulation that views rendering as an aggregation function that fuses the probabilistic contributions of all mesh triangles with respect to the rendered pixels. Such a formulation enables our framework to flow gradients to occluded and distant vertices, which previous state-of-the-art methods cannot achieve. We show that by using the proposed renderer, one can achieve significant improvement in 3D unsupervised single-view reconstruction both qualitatively and quantitatively. Experiments also demonstrate that our approach can handle challenging tasks in image-based shape fitting, which remain nontrivial for existing differentiable renderers.
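The aggregation idea in the abstract can be illustrated with a small sketch. The abstract does not give formulas, so everything specific here is an assumption made for illustration: each triangle's probabilistic contribution to a pixel is modeled as a sigmoid of the signed squared distance to the triangle boundary, and the contributions are fused into a soft silhouette as one minus the product of the complements. The function names and the sharpness parameter `sigma` are hypothetical, and the sketch covers only 2D silhouettes, not color or depth.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from 2D point p to the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    denom = dx * dx + dy * dy
    t = 0.0
    if denom > 0.0:
        t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / denom
        t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def inside_triangle(p, tri):
    """True if p lies inside or on the triangle, for either winding order."""
    signs = []
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        signs.append((b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]))
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)

def stable_sigmoid(x):
    """Logistic function that avoids overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def soft_coverage(p, tri, sigma=1e-2):
    """Assumed probability that triangle tri covers pixel p.

    Positive inside, negative outside; sigma (hypothetical) controls how
    sharply the probability falls off across the triangle boundary.
    """
    d = min(point_segment_distance(p, tri[i], tri[(i + 1) % 3]) for i in range(3))
    signed_sq = d * d if inside_triangle(p, tri) else -d * d
    return stable_sigmoid(signed_sq / sigma)

def soft_silhouette(p, triangles, sigma=1e-2):
    """Fuse per-triangle probabilities: 1 - prod_j (1 - D_j)."""
    miss = 1.0
    for tri in triangles:
        miss *= 1.0 - soft_coverage(p, tri, sigma)
    return 1.0 - miss

tri_a = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
tri_b = ((0.5, 0.5), (1.5, 0.5), (0.5, 1.5))
inside_value = soft_coverage((0.25, 0.25), tri_a)
far_value = soft_coverage((2.0, 2.0), tri_a)
fused_value = soft_silhouette((0.25, 0.25), [tri_a, tri_b])
```

Under these assumptions, a pixel far from a triangle still receives a small nonzero probability rather than a hard zero, which is the property that lets gradients reach distant geometry; a hard rasterizer would assign exactly zero and cut the gradient off.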

Citation

Pages: 50-62

Number of pages: 13

