JurassicWorld Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation

Citations: 0

Authors:

Martin, Alexander [1]

Zheng, Haitian [1]

An, Jie [1]

Luo, Jiebo [1]

Affiliation:

[1] University of Rochester, Rochester, NY 14627, USA

Source:

PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023

Keywords:

image-to-image translation; large domain gap; stable diffusion;

DOI:

10.1145/3581783.3612708

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With a strong understanding of the target domain from natural language, we produce promising results in translating across large domain gaps and bringing skeletons back to life. In this work, we use text-guided latent diffusion models for zero-shot image-to-image translation (I2I) across large domain gaps (longI2I), where large amounts of new visual features and new geometry need to be generated to enter the target domain. Being able to perform translations across large domain gaps has a wide variety of real-world applications in criminology, astrology, environmental conservation, and paleontology. In this work, we introduce a new task Skull2Animal for translating between skulls and living animals. On this task, we find that unguided Generative Adversarial Networks (GANs) are not capable of translating across large domain gaps. Instead of these traditional I2I methods, we explore the use of guided diffusion and image editing models and provide a new benchmark model, Revive2I, capable of performing zero-shot I2I via text-prompting latent diffusion models. We find that guidance is necessary for longI2I because, to bridge the large domain gap, prior knowledge about the target domain is needed. In addition, we find that prompting provides the best and most scalable information about the target domain as classifier-guided diffusion models require retraining for specific use cases and lack stronger constraints on the target domain because of the wide variety of images they are trained on.


Pages: 9320-9328

Page count: 9
