Collaborative Contrastive Learning-Based Generative Model for Image Inpainting

Times Cited: 0
Authors
Du, Yongqiang [1]
Liu, Haoran [1]
Chen, Songnan [2]
Affiliations
[1] Xinyang Agr & Forestry Univ, Sch Informat Engn, Xinyang 464000, Peoples R China
[2] Wuhan Polytech Univ, Sch Math & Comp, Wuhan 430048, Peoples R China
Keywords
Image inpainting; semantic reasoning; contrastive learning; generative model; edge
DOI
10.1109/ACCESS.2022.3211961
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
The critical challenge of image inpainting is to infer reasonable semantics and textures for a corrupted image. Typical methods for image inpainting rely on prior knowledge to synthesize the complete image. One limitation is that these methods often leave undesired blurriness or semantic errors in the synthesized image when handling images with large corrupted areas. In this paper, we propose a Collaborative Contrastive Learning-based Generative Model (C2LGM), which learns content consistency within the same image to ensure that the content inferred for corrupted areas is reasonable with respect to the known content, through pixel-level reconstruction and high-level semantic reasoning. C2LGM leverages an encoder-decoder framework to directly learn the mapping from the corrupted image to the intact image and perform pixel-level reconstruction. To perform semantic reasoning, C2LGM introduces a Collaborative Contrastive Learning (C2L) mechanism that learns high-level semantic consistency between inferred and known content. Specifically, the C2L mechanism introduces high-frequency edge maps into the standard contrastive learning process and enables the deep model to ensure semantic consistency between high-frequency structures and pixel-level content by pulling the representations of inferred and known content close together and pushing unrelated semantic content apart in the latent feature space. Moreover, C2LGM directly absorbs prior structural knowledge from the proposed structural spatial attention module and leverages texture distribution sampling to improve the quality of the synthesized content. As a result, C2LGM achieves a 0.42 dB improvement over competing methods in terms of PSNR when coping with a 40%-50% corruption ratio on the Places2 dataset. Extensive experiments on three benchmark datasets, Paris Street View, CelebA-HQ, and Places2, demonstrate the advantages of the proposed C2LGM over other state-of-the-art image inpainting methods both qualitatively and quantitatively.
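As a rough illustration of the contrastive objective the abstract describes, the sketch below shows an InfoNCE-style loss in which features of inferred (inpainted) regions are pulled toward features of known regions from the same image and pushed away from unrelated content. This is a minimal sketch under assumed names, shapes, and temperature; it is not the authors' implementation, and it omits the edge-map branch of the C2L mechanism.

```python
# Illustrative InfoNCE-style contrastive loss (assumed interface, not the paper's code).
import torch
import torch.nn.functional as F

def contrastive_inpainting_loss(inferred_feat, known_feat, negative_feats, temperature=0.07):
    """inferred_feat, known_feat: (B, D); negative_feats: (B, K, D)."""
    q = F.normalize(inferred_feat, dim=-1)       # anchor: features of inferred (inpainted) regions
    k_pos = F.normalize(known_feat, dim=-1)      # positive: features of known regions in the same image
    k_neg = F.normalize(negative_feats, dim=-1)  # negatives: unrelated semantic content

    pos_logits = (q * k_pos).sum(dim=-1, keepdim=True)        # (B, 1) similarity to the positive
    neg_logits = torch.einsum("bd,bkd->bk", q, k_neg)          # (B, K) similarity to each negative
    logits = torch.cat([pos_logits, neg_logits], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)  # positive sits at index 0
    return F.cross_entropy(logits, labels)
```

In such a setup, the anchor, positive, and negative vectors would typically be pooled region embeddings produced by the encoder; minimizing the loss pulls the inferred-content representation toward the known-content representation and pushes it away from the negatives in the latent feature space.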
Pages: 106641-106654
Page count: 14