Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation

Authors
Martin, Alexander [1]
Zheng, Haitian [1]
An, Jie [1]
Luo, Jiebo [1]
Affiliations
[1] Univ Rochester, Rochester, NY 14627 USA
Keywords
image-to-image translation; large domain gap; stable diffusion
DOI
10.1145/3581783.3612708
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With a strong understanding of the target domain from natural language, we produce promising results in translating across large domain gaps and bringing skeletons back to life. In this work, we use text-guided latent diffusion models for zero-shot image-to-image translation (I2I) across large domain gaps (longI2I), where large amounts of new visual features and new geometry need to be generated to enter the target domain. Being able to perform translations across large domain gaps has a wide variety of real-world applications in criminology, astrology, environmental conservation, and paleontology. In this work, we introduce a new task, Skull2Animal, for translating between skulls and living animals. On this task, we find that unguided Generative Adversarial Networks (GANs) are not capable of translating across large domain gaps. Instead of these traditional I2I methods, we explore the use of guided diffusion and image editing models and provide a new benchmark model, Revive2I, capable of performing zero-shot I2I via text-prompting latent diffusion models. We find that guidance is necessary for longI2I because, to bridge the large domain gap, prior knowledge about the target domain is needed. In addition, we find that prompting provides the best and most scalable information about the target domain, as classifier-guided diffusion models require retraining for specific use cases and lack stronger constraints on the target domain because of the wide variety of images they are trained on.
Pages: 9320-9328
Page count: 9
Related papers
39 in total
  • [1] Zero-shot Image-to-Image Translation
    Parmar, Gaurav
    Singh, Krishna Kumar
    Zhang, Richard
    Li, Yijun
    Lu, Jingwan
    Zhu, Jun-Yan
    PROCEEDINGS OF SIGGRAPH 2023 CONFERENCE PAPERS, SIGGRAPH 2023, 2023
  • [2] Zero-shot unsupervised image-to-image translation via exploiting semantic attributes
    Chen, Yuanqi
    Yu, Xiaoming
    Liu, Shan
    Gao, Wei
    Li, Ge
    Image and Vision Computing, 2022, 124
  • [4] ZstGAN: An adversarial approach for Unsupervised Zero-Shot Image-to-image Translation
    Lin, Jianxin
    Xia, Yingce
    Liu, Sen
    Zhao, Shuxin
    Chen, Zhibo
    NEUROCOMPUTING, 2021, 461 : 327 - 335
  • [5] Zero-Shot Medical Image Translation via Frequency-Guided Diffusion Models
    Li, Yunxiang
    Shao, Hua-Chieh
    Liang, Xiao
    Chen, Liyuan
    Li, Ruiqi
    Jiang, Steve
    Wang, Jing
    Zhang, You
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2024, 43 (03) : 980 - 993
  • [6] Inductive Zero-Shot Image Annotation via Embedding Graph
    Wang, Fangxin
    Liu, Jie
    Zhang, Shuwu
    Zhang, Guixuan
    Li, Yuejun
    Yuan, Fei
    IEEE ACCESS, 2019, 7 : 107816 - 107830
  • [7] Zero-shot Pose Estimation Using Image Translation to Maintain Object Pose
    Fujita, K.
    Tasaki, T.
    IEEJ Transactions on Electronics, Information and Systems, 2023, 143 (12) : 1113 - 1122
  • [8] Zero-shot image classification via Visual–Semantic Feature Decoupling
    Sun, Xin
    Tian, Yu
    Li, Haojie
    Multimedia Systems, 2024, 30
  • [9] Boosting Zero-Shot Image Classification via Pairwise Relationship Learning
    Li, Hanhui
    Wu, Hefeng
    Lin, Shujin
    Lin, Liang
    Luo, Xiaonan
    Izquierdo, Ebroul
    COMPUTER VISION - ACCV 2016, PT I, 2017, 10111 : 85 - 99
  • [10] SimSAM: Zero-Shot Medical Image Segmentation via Simulated Interaction
    Towle, Benjamin
    Chen, Xin
    Zhou, Ke
    IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI 2024, 2024