Deep Learning-Based Image and Video Inpainting: A Survey

Cited by: 10

Authors:

Quan, Weize [1,2]

Chen, Jiaxi [1,2]

Liu, Yanli [3]

Yan, Dong-Ming [1,2]

Wonka, Peter [4]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China

[3] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China

[4] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal, Saudi Arabia

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2024 / Vol. 132 / No. 07

Funding:

National Natural Science Foundation of China;

Keywords:

Image inpainting; Video inpainting; Deep learning; Content generation; CROWDED SCENES; PEOPLE; NUMBER; SCALE;

DOI:

10.1007/s11263-023-01977-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image and video inpainting is a classic problem in computer vision and computer graphics that aims to fill the missing regions of images and videos with plausible, realistic content. With the advance of deep learning, significant progress has recently been made on this problem. The goal of this paper is to comprehensively review deep learning-based methods for image and video inpainting. Specifically, we sort existing methods into categories according to their high-level inpainting pipeline, present different deep learning architectures, including CNN, VAE, GAN, diffusion models, etc., and summarize techniques for module design. We review the training objectives and the common benchmark datasets. We present evaluation metrics for low-level pixel similarity and high-level perceptual similarity, conduct a performance evaluation, and discuss the strengths and weaknesses of representative inpainting methods. We also discuss related real-world applications. Finally, we discuss open challenges and suggest potential future research directions.


Pages: 2367-2400

Page count: 34

References: 274 in total

