Deep Learning-Based Image and Video Inpainting: A Survey

Cited by: 22

Authors:

Quan, Weize ^{[1,2]}

Chen, Jiaxi ^{[1,2]}

Liu, Yanli ^{[3]}

Yan, Dong-Ming ^{[1,2]}

Wonka, Peter ^{[4]}

Affiliations:

[1] Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China

[3] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China

[4] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal, Saudi Arabia

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2024 / Vol. 132 / No. 07

Funding:

National Natural Science Foundation of China;

Keywords:

Image inpainting; Video inpainting; Deep learning; Content generation; CROWDED SCENES; PEOPLE; NUMBER;

DOI:

10.1007/s11263-023-01977-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image and video inpainting is a classic problem in computer vision and computer graphics, aiming to fill the missing areas of images and videos with plausible and realistic content. With the advance of deep learning, significant progress has recently been made on this problem. The goal of this paper is to comprehensively review the deep learning-based methods for image and video inpainting. Specifically, we sort existing methods into different categories from the perspective of their high-level inpainting pipeline, present different deep learning architectures, including CNN, VAE, GAN, diffusion models, etc., and summarize techniques for module design. We review the training objectives and the common benchmark datasets. We present evaluation metrics for low-level pixel and high-level perceptual similarity, conduct a performance evaluation, and discuss the strengths and weaknesses of representative inpainting methods. We also discuss related real-world applications. Finally, we discuss open challenges and suggest potential future research directions.


Pages: 2367-2400

Page count: 34
