Text2Tex: Text-driven Texture Synthesis via Diffusion Models

Cited by: 14
Authors
Chen, Dave Zhenyu [1 ]
Siddiqui, Yawar [1 ]
Lee, Hsin-Ying [2 ]
Tulyakov, Sergey [2 ]
Niessner, Matthias [1 ]
Affiliations
[1] Tech Univ Munich, Munich, Germany
[2] Snap Res, Santa Monica, CA 90405 USA
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023
Keywords
DOI
10.1109/ICCV51070.2023.01701
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present Text2Tex, a novel method for generating high-quality textures for 3D meshes from the given text prompts. Our method incorporates inpainting into a pre-trained depth-aware image diffusion model to progressively synthesize high resolution partial textures from multiple viewpoints. To avoid accumulating inconsistent and stretched artifacts across views, we dynamically segment the rendered view into a generation mask, which represents the generation status of each visible texel. This partitioned view representation guides the depth-aware inpainting model to generate and update partial textures for the corresponding regions. Furthermore, we propose an automatic view sequence generation scheme to determine the next best view for updating the partial texture. Extensive experiments demonstrate that our method significantly outperforms the existing text-driven approaches and GAN-based methods.
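The abstract outlines a progressive, view-by-view texturing pipeline: render the mesh from a viewpoint, partition visible texels into a generation mask, inpaint the masked regions with a depth-aware diffusion model, back-project the result into the texture, and repeat over automatically selected next-best views. The following Python sketch illustrates only that control flow under stated assumptions; every helper function and constant here (render_view, build_generation_mask, depth_aware_inpaint, backproject, select_next_best_view, the status labels) is a hypothetical stub invented for illustration and is not the authors' released implementation.

```python
import numpy as np

# Hypothetical sketch of the progressive texturing loop described in the
# abstract. All helpers are placeholder stubs; a real system would plug in a
# mesh renderer and a depth-conditioned diffusion inpainting model.

NEW, UPDATE, KEEP = 0, 1, 2  # assumed per-texel generation status labels


def render_view(mesh, texture, view, size=512):
    """Stub renderer: returns RGB, depth, and a pixel-to-texel index map."""
    rgb = np.zeros((size, size, 3))
    depth = np.zeros((size, size))
    texel_ids = np.zeros((size, size), dtype=int)
    return rgb, depth, texel_ids


def build_generation_mask(texel_ids, status):
    """Mark pixels whose texels are unpainted (NEW) or need updating (UPDATE)."""
    return np.isin(status[texel_ids], [NEW, UPDATE])


def depth_aware_inpaint(rgb, depth, mask, prompt):
    """Stub for the depth-conditioned diffusion inpainting call."""
    return rgb  # a real model would synthesize new content inside `mask`


def backproject(texture, status, rgb, texel_ids, mask):
    """Write generated pixels back into the UV texture and mark them as done."""
    texture[texel_ids[mask]] = rgb[mask]
    status[texel_ids[mask]] = KEEP
    return texture, status


def select_next_best_view(status):
    """Stub: pick the viewpoint covering the most texels still needing work."""
    return {"azimuth": 0.0, "elevation": 0.0, "distance": 1.5}


def texture_mesh(mesh, prompt, preset_views, n_refine=4):
    """Progressively generate, then refine, a UV texture from a text prompt."""
    n_texels = 1024 * 1024
    texture = np.zeros((n_texels, 3))
    status = np.full(n_texels, NEW)

    # Generation pass over preset views, then a refinement pass over
    # automatically selected next-best views.
    views = list(preset_views) + [None] * n_refine
    for view in views:
        if view is None:
            view = select_next_best_view(status)
        rgb, depth, texel_ids = render_view(mesh, texture, view)
        mask = build_generation_mask(texel_ids, status)
        rgb = depth_aware_inpaint(rgb, depth, mask, prompt)
        texture, status = backproject(texture, status, rgb, texel_ids, mask)
    return texture
```

Calling texture_mesh with dummy arguments exercises the loop structure only; the specific view counts, mask categories, and update rules of the actual method are not reproduced here.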
Pages: 18512 - 18522
Number of pages: 11
Related Papers
50 records in total
  • [31] Text-driven Speech Animation with Emotion Control
    Chae, Wonseok
    Kim, Yejin
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2020, 14 (08): 3473 - 3487
  • [32] SceneScape: Text-Driven Consistent Scene Generation
    Fridman, Rafail
    Abecasis, Amit
    Kasten, Yoni
    Dekel, Tali
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [33] Text2City: One-Stage Text-Driven Urban Layout Regeneration
    Qin, Yiming
    Zhao, Nanxuan
    Sheng, Bin
    Lau, Rynson W. H.
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 5, 2024, : 4578 - 4586
  • [34] Multi-channel correlated diffusion for text-driven artistic style transfer
    Jiang, Guoquan
    Wang, Canyu
    Huo, Zhanqiang
    Xu, Huan
    VISUAL COMPUTER, 2025,
  • [35] GUESS: GradUally Enriching SyntheSis for Text-Driven Human Motion Generation
    Gao, Xuehao
    Yang, Yang
    Xie, Zhenyu
    Du, Shaoyi
    Sun, Zhongqian
    Wu, Yang
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (12) : 7518 - 7530
  • [36] StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
    Patashnik, Or
    Wu, Zongze
    Shechtman, Eli
    Cohen-Or, Daniel
    Lischinski, Dani
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 2065 - 2074
  • [37] Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model
    Wang, Yin
    Leng, Zhiying
    Li, Frederick W. B.
    Wu, Shun-Cheng
    Liang, Xiaohui
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 21978 - 21987
  • [38] DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation
    Lyu, Yueming
    Lin, Tianwei
    Li, Fu
    He, Dongliang
    Dong, Jing
    Tan, Tieniu
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 6894 - 6903
  • [39] TELL YOUR STORY: TEXT-DRIVEN FACE VIDEO SYNTHESIS WITH HIGH DIVERSITY VIA ADVERSARIAL LEARNING
    Hou, Xia
    Sun, Meng
    Song, Wenfeng
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 515 - 519
  • [40] Text2VRScene: Exploring the Framework of Automated Text-driven Generation System for VR Experience
    Yin, Zhizhuo
    Wang, Yuyang
    Papatheodorou, Theodoros
    Hui, Pan
    2024 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES, VR 2024, 2024, : 701 - 711