Resolution-robust Large Mask Inpainting with Fourier Convolutions

Cited by: 520
Authors
Suvorov, Roman [1]
Logacheva, Elizaveta [1]
Mashikhin, Anton [1]
Remizova, Anastasia [1, 3]
Ashukha, Arsenii [1]
Silvestrov, Aleksei [1]
Kong, Naejin [2]
Goka, Harshith [2]
Park, Kiwoong [2]
Lempitsky, Victor [1, 4]
Affiliations
[1] Samsung AI Ctr Moscow, Moscow, Russia
[2] Samsung Res, Suwon, South Korea
[3] Swiss Fed Inst Technol Lausanne EPFL, Lausanne, Switzerland
[4] Skolkovo Inst Sci & Technol, Moscow, Russia
Source
2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022) | 2022
Keywords
DOI
10.1109/WACV51458.2022.00323
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Modern image inpainting systems, despite significant progress, often struggle with large missing areas, complex geometric structures, and high-resolution images. We find that one of the main reasons for this is the lack of an effective receptive field in both the inpainting network and the loss function. To alleviate this issue, we propose a new method called large mask inpainting (LaMa). LaMa is based on i) a new inpainting network architecture that uses fast Fourier convolutions (FFCs), which have an image-wide receptive field; ii) a high receptive field perceptual loss; iii) large training masks, which unlock the potential of the first two components. Our inpainting network improves the state of the art across a range of datasets and achieves excellent performance even in challenging scenarios, e.g., completion of periodic structures. Our model generalizes surprisingly well to resolutions that are higher than those seen at train time, and achieves this at lower parameter and time costs than the competitive baselines. The code is available at https://github.com/saic-mdal/lama.
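The abstract's claim that Fourier-domain operations have an image-wide receptive field can be illustrated with a toy 1D sketch. This is not the authors' FFC layer: the signal length `N`, the fixed `SPECTRAL_WEIGHTS`, and the helper names are all assumptions chosen for illustration, and a naive DFT from the standard library stands in for a real FFT. The sketch perturbs one input sample and reports which output positions change, first for a per-frequency multiplication and then for an ordinary 3-tap convolution.

```python
import cmath

N = 8  # toy signal length (assumption for illustration)

def dft(signal):
    """Naive discrete Fourier transform of a sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

# Hypothetical fixed per-frequency weights; a trained layer would learn these.
SPECTRAL_WEIGHTS = [1.0 / (1 + min(k, N - k)) for k in range(N)]

def spectral_layer(signal):
    """Multiply each frequency by its own weight, then return to signal space."""
    spectrum = dft(signal)
    mixed = [w * s for w, s in zip(SPECTRAL_WEIGHTS, spectrum)]
    return [value.real for value in idft(mixed)]

def local_conv(signal, kernel=(0.25, 0.5, 0.25)):
    """Ordinary circular convolution with a small spatial kernel."""
    n = len(signal)
    radius = len(kernel) // 2
    return [sum(kernel[j] * signal[(i + j - radius) % n] for j in range(len(kernel)))
            for i in range(n)]

def affected_positions(layer, signal, index, eps=1e-9):
    """Output positions that change when signal[index] is perturbed."""
    bumped = list(signal)
    bumped[index] += 1.0
    base, out = layer(signal), layer(bumped)
    return [i for i, (a, b) in enumerate(zip(base, out)) if abs(a - b) > eps]

x = [float(i % 3) for i in range(N)]
spectral_hits = affected_positions(spectral_layer, x, 3)
local_hits = affected_positions(local_conv, x, 3)
print("spectral layer:", spectral_hits)
print("3-tap conv:    ", local_hits)
```

With these particular weights, a single spectral layer spreads the perturbation to every output position, whereas the 3-tap convolution only reaches the immediate neighbours. A stack of small spatial kernels would need many layers to cover the same extent, which is the receptive-field limitation the abstract refers to.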
Pages: 3172-3182
Page count: 11