CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing

Cited by: 0

Authors:

Xiao, Chufeng [1,2]

Fu, Hongbo [3]

Affiliations:

[1] Hong Kong Univ Sci & Technol, HKGAI, Hong Kong, Peoples R China

[2] City Univ Hong Kong, Sch Creat Media, Hong Kong, Peoples R China

[3] Hong Kong Univ Sci & Technol, Div Arts & Machine Creat, Hong Kong, Peoples R China

Source:

COMPUTER GRAPHICS FORUM | 2024, Vol. 43, No. 7

Keywords:

CCS Concepts; • Computing methodologies → Image manipulation;

DOI:

10.1111/cgf.15247

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Subject classification codes:

081202 ; 0835 ;

Abstract:

Personalization techniques for large text-to-image (T2I) models allow users to incorporate new concepts from reference images. However, existing methods primarily rely on textual descriptions, leading to limited control over customized images and failing to support fine-grained and local editing (e.g., shape, pose, and details). In this paper, we identify sketches as an intuitive and versatile representation that can facilitate such control, e.g., contour lines capturing shape information and flow lines representing texture. This motivates us to explore a novel task of sketch concept extraction: given one or more sketch-image pairs, we aim to extract a special sketch concept that bridges the correspondence between the images and sketches, thus enabling sketch-based image synthesis and editing at a fine-grained level. To accomplish this, we introduce CustomSketching, a two-stage framework for extracting novel sketch concepts via few-shot learning. Considering that an object can often be depicted by a contour for general shapes and additional strokes for internal details, we introduce a dual-sketch representation to reduce the inherent ambiguity in sketch depiction. We employ a shape loss and a regularization loss to balance fidelity and editability during optimization. Through extensive experiments, a user study, and several applications, we show our method is effective and superior to the adapted baselines.

Pages: 12
