Zero-Shot Text-Guided Object Generation with Dream Fields

Cited by: 231
Authors
Jain, Ajay [1 ,2 ]
Mildenhall, Ben [2 ]
Barron, Jonathan T. [2 ]
Abbeel, Pieter [1 ]
Poole, Ben [2 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Google Res, Mountain View, CA 94043 USA
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Keywords
DOI
10.1109/CVPR52688.2022.00094
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We combine neural rendering with multi-modal image and text representations to synthesize diverse 3D objects solely from natural language descriptions. Our method, Dream Fields, can generate the geometry and color of a wide range of objects without 3D supervision. Due to the scarcity of diverse, captioned 3D data, prior methods only generate objects from a handful of categories, such as ShapeNet. Instead, we guide generation with image-text models pre-trained on large datasets of captioned images from the web. Our method optimizes a Neural Radiance Field from many camera views so that rendered images score highly with a target caption according to a pre-trained CLIP model. To improve fidelity and visual quality, we introduce simple geometric priors, including sparsity-inducing transmittance regularization, scene bounds, and new MLP architectures. In experiments, Dream Fields produce realistic, multi-view consistent object geometry and color from a variety of natural language captions.
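The core objective described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `image_emb` and `caption_emb` stand in for embeddings from a pre-trained CLIP model, `transmittance_map` stands in for the per-pixel transmittance of a rendered NeRF view, and the hyperparameters `tau` and `lam` are assumed values for illustration. The sparsity term follows the paper's description of a transmittance regularizer that saturates at a target mean transmittance.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def dream_fields_loss(image_emb, caption_emb, transmittance_map,
                      tau=0.88, lam=0.5):
    """Simplified sketch of the Dream Fields objective.

    image_emb / caption_emb: CLIP embeddings of a rendered view and the
    target caption (stand-ins; a real run uses a pre-trained CLIP model).
    transmittance_map: per-pixel transmittance T of the rendered view.
    tau, lam: assumed target transmittance and regularizer weight.

    The first term rewards image-caption agreement under CLIP; the
    second is a sparsity-inducing prior that pushes mean transmittance
    up, saturating once it reaches tau.
    """
    clip_loss = -cosine_similarity(image_emb, caption_emb)
    mean_transmittance = float(transmittance_map.mean())
    sparsity_reg = -min(tau, mean_transmittance)  # saturates at -tau
    return clip_loss + lam * sparsity_reg
```

In the full method, this scalar would be minimized with a gradient-based optimizer over the NeRF's MLP parameters, averaging the loss over renders from many sampled camera poses.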
Pages: 857-866
Page count: 10