Fine-Grained Image Editing Using ControlNet: Expanding Possibilities in Visual Manipulation

Cited by: 0
Authors
Xu, Longfei [1 ]
Huang, Hongbo [1 ]
Zhao, Yushuang [1 ]
Pan, Shuwen [1 ]
Zheng, Yaolin [1 ]
Yan, Xiaoxu [1 ]
Huang, Linkai [1 ]
Wu, Lishan [1 ]
Affiliations
[1] Beijing Information Science & Technology University, Beijing, Peoples R China
Source
ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT VI, ICIC 2024 | 2024 / Vol. 14867
Funding
National Natural Science Foundation of China
Keywords
Diffusion Probabilistic Model; ControlNet; Image Editing
DOI
10.1007/978-981-97-5597-4_3
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
In recent years, diffusion probabilistic models have emerged as a hot topic in computer vision. Image generation models such as Imagen, Latent Diffusion Models, and Stable Diffusion have shown outstanding generative power, sparking considerable community discussion. However, they often lack the ability to precisely edit real-world images. In this paper, we propose a novel ControlNet-based image editing framework that enables alteration of real images based on pose maps, scribble maps, and other features without the need for training or fine-tuning. Given a guiding image as input, we edit the initial noise generated from the guiding image to influence the generation process. Features extracted from the guiding image are then injected directly into the generation process of the translated image. We also construct a classifier guidance based on strong correspondences between intermediate features of the ControlNet branches: the editing signals are converted into gradients that guide the sampling direction. Finally, we demonstrate high-quality results of our proposed model on image editing tasks.
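The classifier-guidance idea in the abstract, converting a feature-correspondence loss into gradients that steer the sampling direction, can be illustrated with a minimal numerical sketch. Everything below is a hypothetical stand-in: the linear "feature extractor" `W`, the guiding vector, and the step size are illustrative only, and the paper's method instead uses intermediate features of the ControlNet branches inside a full diffusion sampler.

```python
import numpy as np

# Toy stand-in for a ControlNet-branch feature extractor: a fixed
# linear map (hypothetical; the paper uses intermediate features of
# the ControlNet branches of a diffusion U-Net).
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])

def features(x):
    return W @ x

# Editing signal: match the features of a guiding image.
x_guide = np.array([1.0, 2.0, 0.5])
f_target = features(x_guide)

def guidance_gradient(x):
    # Gradient of the correspondence loss L(x) = ||W x - f_target||^2,
    # i.e. 2 W^T (W x - f_target); this is the "editing signal
    # converted into a gradient".
    return 2.0 * W.T @ (features(x) - f_target)

def guided_sampling(steps=100, scale=0.1):
    # Start from noise derived from the guiding image (a real sampler
    # would use DDIM-style inversion and also apply the denoiser's
    # epsilon prediction at every step; only the guidance term is
    # shown here).
    x = x_guide + np.array([0.5, -0.3, 0.2])
    for _ in range(steps):
        x = x - scale * guidance_gradient(x)
    return x

x_edit = guided_sampling()
loss = np.linalg.norm(features(x_edit) - f_target)
```

After the guided steps, the sample's features closely match the guiding image's features, which is the effect the guidance term is designed to produce; in the full method this happens jointly with denoising rather than as a separate descent.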
Pages: 27-38
Page count: 12