Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation

Authors
Martin, Alexander [1]
Zheng, Haitian [1]
An, Jie [1]
Luo, Jiebo [1]
Affiliations
[1] Univ Rochester, Rochester, NY 14627 USA
Keywords
image-to-image translation; large domain gap; stable diffusion
DOI
10.1145/3581783.3612708
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With a strong understanding of the target domain from natural language, we produce promising results in translating across large domain gaps and bringing skeletons back to life. In this work, we use text-guided latent diffusion models for zero-shot image-to-image translation (I2I) across large domain gaps (longI2I), where large amounts of new visual features and new geometry need to be generated to enter the target domain. Being able to perform translations across large domain gaps has a wide variety of real-world applications in criminology, astrology, environmental conservation, and paleontology. We introduce a new task, Skull2Animal, for translating between skulls and living animals. On this task, we find that unguided Generative Adversarial Networks (GANs) are not capable of translating across large domain gaps. Instead of these traditional I2I methods, we explore the use of guided diffusion and image editing models and provide a new benchmark model, Revive2I, capable of performing zero-shot I2I via text-prompting latent diffusion models. We find that guidance is necessary for longI2I because prior knowledge about the target domain is needed to bridge the large domain gap. In addition, we find that prompting provides the best and most scalable information about the target domain, as classifier-guided diffusion models require retraining for specific use cases and lack stronger constraints on the target domain because of the wide variety of images they are trained on.
Pages: 9320-9328
Page count: 9