Multi3D: 3D-aware multimodal image synthesis

Cited by: 1
Authors
Zhou, Wenyang [1]
Yuan, Lu [2]
Mu, Taijiang [1]
Affiliations
[1] Tsinghua Univ, BNRist, Beijing 100084, Peoples R China
[2] Stanford Univ, Comp Sci Dept, Stanford, CA 94305 USA
Funding
National Natural Science Foundation of China
Keywords
generative adversarial networks (GANs); neural radiance field (NeRF); 3D-aware image synthesis; controllable generation; manipulation
DOI
10.1007/s41095-024-0422-4
Chinese Library Classification
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
3D-aware image synthesis has attained high quality and robust 3D consistency. Existing 3D controllable generative models are designed to synthesize 3D-aware images through a single modality, such as 2D segmentation or sketches, but lack the ability to finely control generated content, such as texture and age. To enhance user-guided controllability, we propose Multi3D, a 3D-aware controllable image synthesis model that supports multi-modal input. Our model can govern the geometry of the generated image using a 2D label map, such as a segmentation or sketch map, while concurrently regulating the appearance of the generated image through a textual description. To demonstrate the effectiveness of our method, we have conducted experiments on multiple datasets, including CelebAMask-HQ, AFHQ-cat, and ShapeNet-car. Qualitative and quantitative evaluations show that our method outperforms existing state-of-the-art methods.
Pages: 1205-1217
Number of pages: 13