Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model

Cited by: 18
Authors
Yang, Shiyuan [1]
Chen, Xiaodong [2]
Liao, Jing [1]
Affiliations
[1] City Univ Hong Kong, Hong Kong, Peoples R China
[2] Tianjin Univ, Tianjin, Peoples R China
Keywords
Image Inpainting; Diffusion Model; Multimodal
DOI
10.1145/3581783.3612200
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, text-to-image denoising diffusion probabilistic models (DDPMs) have demonstrated impressive image generation capabilities and have also been successfully applied to image inpainting. However, in practice, users often require more control over the inpainting process beyond textual guidance, especially when they want to composite objects with customized appearance, color, shape, and layout. Unfortunately, existing diffusion-based inpainting methods are limited to single-modal guidance and require task-specific training, hindering their cross-modal scalability. To address these limitations, we propose Uni-paint, a unified framework for multi-modal inpainting that offers various modes of guidance, including unconditional, text-driven, stroke-driven, exemplar-driven inpainting, as well as a combination of these modes. Furthermore, our Uni-paint is based on pretrained Stable Diffusion and does not require task-specific training on specific datasets, enabling few-shot generalizability to customized images. We have conducted extensive qualitative and quantitative evaluations that show our approach achieves comparable results to existing single-modal methods while offering multimodal inpainting capabilities not available in other methods. Code is available at https://github.com/ysy31415/unipaint.
Pages: 3190-3199
Page count: 10