Deep Learning-Based Image and Video Inpainting: A Survey

Cited by: 10
Authors
Quan, Weize [1, 2]
Chen, Jiaxi [1, 2]
Liu, Yanli [3]
Yan, Dong-Ming [1, 2]
Wonka, Peter [4]
Affiliations
[1] Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[3] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China
[4] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal, Saudi Arabia
Funding
National Natural Science Foundation of China
Keywords
Image inpainting; Video inpainting; Deep learning; Content generation; CROWDED SCENES; PEOPLE; NUMBER; SCALE;
DOI
10.1007/s11263-023-01977-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image and video inpainting is a classic problem in computer vision and computer graphics that aims to fill in missing areas of images and videos with plausible and realistic content. With the advance of deep learning, significant progress has recently been made on this problem. The goal of this paper is to comprehensively review deep learning-based methods for image and video inpainting. Specifically, we sort existing methods into categories according to their high-level inpainting pipeline, present different deep learning architectures, including CNN, VAE, GAN, and diffusion models, and summarize techniques for module design. We review the training objectives and the common benchmark datasets. We present evaluation metrics for low-level pixel similarity and high-level perceptual similarity, conduct a performance evaluation, and discuss the strengths and weaknesses of representative inpainting methods. We also discuss related real-world applications. Finally, we discuss open challenges and suggest potential future research directions.
Pages: 2367 - 2400
Page count: 34