Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates

Cited by: 1

Authors:

Shum, Ka Chun ^{[1]}

Kim, Jaeyeon ^{[1]}

Binh-Son Hua ^{[2,4]}

Duc Thanh Nguyen ^{[3]}

Yeung, Sai-Kit ^{[1]}

Affiliations:

[1] Hong Kong University of Science and Technology, Hong Kong, People's Republic of China

[2] VinAI, Hanoi, Vietnam

[3] Deakin University, Geelong, Victoria, Australia

[4] Trinity College Dublin, Dublin, Ireland

Source:

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024) | 2024

Keywords:

DOI:

10.1109/CVPR52733.2024.00495

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Neural radiance fields (NeRFs) are an emerging technique for 3D scene reconstruction and modeling. However, current NeRF-based methods are limited in their ability to add or remove objects. This paper addresses that gap by proposing a new language-driven method for object manipulation in NeRFs through dataset updates. Specifically, to insert an object represented by a set of multi-view images into a background NeRF, we use a text-to-image diffusion model to blend the object into the given background across views. The generated images are then used to update the NeRF so that we can render view-consistent images of the object within the background. To ensure view consistency, we propose a dataset update strategy that prioritizes radiance field training based on camera poses, in a pose-ordered manner. We validate our method in two case studies: object insertion and object removal. Experimental results show that our method can generate photo-realistic results and achieves state-of-the-art performance in NeRF editing.
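As a rough illustration of what a pose-ordered dataset update could look like, the sketch below sorts training views by their angular distance from a chosen anchor view, so that views near the anchor would be edited and fed back into training first. The abstract does not state the actual ordering criterion, so the angular-distance rule, the assumption that the scene is centered at the origin, and the names `angular_distance` and `pose_ordered_indices` are all hypothetical and are not taken from the paper.

```python
import math


def angular_distance(a, b):
    """Angle in radians between two 3D camera positions, seen from the origin.

    Assumption: the scene is centered at the origin, so the angle between
    position vectors is a reasonable proxy for how different two views are.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_angle)


def pose_ordered_indices(camera_positions, anchor_index=0):
    """Return view indices sorted from nearest to farthest from the anchor view.

    Hypothetical scheduling rule: views would be updated in this order.
    """
    anchor = camera_positions[anchor_index]
    return sorted(
        range(len(camera_positions)),
        key=lambda i: angular_distance(anchor, camera_positions[i]),
    )


# Four toy camera positions around the origin; view 0 is the anchor.
positions = [
    (1.0, 0.0, 0.0),   # anchor
    (-1.0, 0.0, 0.0),  # opposite side (180 degrees away)
    (0.0, 1.0, 0.0),   # 90 degrees away
    (1.0, 1.0, 0.0),   # 45 degrees away
]
order = pose_ordered_indices(positions, anchor_index=0)
print(order)
```

In this toy setup the anchor comes first and the opposite view comes last. A real pipeline would replace the printed order with a loop that edits each view with the diffusion model and continues NeRF training in between.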

Citation

Pages: 5176-5187

Page count: 12