Deep Learning-Based Image and Video Inpainting: A Survey

Cited by: 10

Authors:

Quan, Weize [1,2]

Chen, Jiaxi [1,2]

Liu, Yanli [3]

Yan, Dong-Ming [1,2]

Wonka, Peter [4]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China

[3] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China

[4] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal, Saudi Arabia

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2024, Vol. 132, No. 07

Funding:

National Natural Science Foundation of China;

Keywords:

Image inpainting; Video inpainting; Deep learning; Content generation; CROWDED SCENES; PEOPLE; NUMBER; SCALE;

DOI:

10.1007/s11263-023-01977-6

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image and video inpainting is a classic problem in computer vision and computer graphics, aiming to fill in the plausible and realistic content in the missing areas of images and videos. With the advance of deep learning, this problem has achieved significant progress recently. The goal of this paper is to comprehensively review the deep learning-based methods for image and video inpainting. Specifically, we sort existing methods into different categories from the perspective of their high-level inpainting pipeline, present different deep learning architectures, including CNN, VAE, GAN, diffusion models, etc., and summarize techniques for module design. We review the training objectives and the common benchmark datasets. We present evaluation metrics for low-level pixel and high-level perceptional similarity, conduct a performance evaluation, and discuss the strengths and weaknesses of representative inpainting methods. We also discuss related real-world applications. Finally, we discuss open challenges and suggest potential future research directions.

Citation

Pages: 2367-2400

Page count: 34

274 references in total

