Hierarchical Vector-Quantized Variational Autoencoder and Vector Credibility Mechanism for High-Quality Image Inpainting

Cited by: 0

Authors:

Li, Cheng [1]

Xu, Dan [1]

Chen, Kuai [2]

Affiliations:

[1] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650106, Peoples R China

[2] Yunnan Univ, Sch Govt, Kunming 650106, Peoples R China

Source:

ELECTRONICS | 2024 / Volume 13 / Issue 10

Funding:

National Natural Science Foundation of China;

Keywords:

image inpainting; VQ-VAE; vector credibility; codebook;

DOI:

10.3390/electronics13101852

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Image inpainting infers the missing areas of a corrupted image from the information in its undamaged parts. With the rapid development of deep learning, many existing image inpainting methods can generate plausible results from damaged images. However, they still suffer from over-smoothed or distorted textures when the image contains complex textural details or large damaged areas. To restore textures at a fine-grained level, we propose an image inpainting method based on a hierarchical VQ-VAE with a vector credibility mechanism. The method first trains the hierarchical VQ-VAE on ground truth images to update two codebooks and to obtain two corresponding vector collections that contain information about the ground truth images. The two vector collections are fed to a decoder to generate the corresponding high-fidelity outputs. An encoder is then trained on the corresponding damaged images. With the help of the prior knowledge provided by the codebooks, it generates vector collections that approximate those of the ground truth. The two vector collections then pass through the decoder of the hierarchical VQ-VAE to produce the inpainted results. In addition, we apply a vector credibility mechanism to push the vector collections from damaged images toward those from ground truth images. To further improve the inpainting result, we apply a refinement network that uses residual blocks with different dilation rates to capture both global information and local textural details. Extensive experiments on several datasets demonstrate that our method outperforms state-of-the-art methods.
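
The abstract gives no implementation details, so the following is only a minimal sketch of the generic VQ-VAE codebook lookup that such a method builds on, not the authors' code. It assumes NumPy and nearest-neighbor matching under squared Euclidean distance; the names `quantize`, `latents`, and `codebook` are hypothetical, and the hierarchical (two-codebook) structure and the vector credibility mechanism are not modeled here.

```python
import numpy as np


def quantize(latents, codebook):
    """Replace each latent vector with its nearest codebook entry.

    latents:  array of shape (N, D), encoder outputs (hypothetical name)
    codebook: array of shape (K, D), learned code vectors (hypothetical name)
    Returns the quantized vectors (N, D) and the chosen indices (N,).
    """
    # Pairwise differences via broadcasting: shape (N, K, D).
    diffs = latents[:, None, :] - codebook[None, :, :]
    # Squared Euclidean distance from every latent to every code: shape (N, K).
    dists = np.sum(diffs ** 2, axis=-1)
    # Index of the closest code for each latent vector.
    indices = np.argmin(dists, axis=1)
    return codebook[indices], indices


# Toy data: three 2-D codes and three latent vectors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
latents = np.array([[0.1, -0.2], [0.9, 1.2], [3.0, 3.5]])
quantized, indices = quantize(latents, codebook)
```

In a hierarchical setup, a lookup of this kind would presumably be applied once per codebook, with the resulting vector collections passed to the decoder as the abstract describes.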


Pages: 17

50 items in total

[1] Vector-Quantized Variational AutoEncoder for pansharpening
Talbi, Farid
Elmezouar, Miloud Chikr
Boutellaa, Elhocine
Alim, Fatiha
INTERNATIONAL JOURNAL OF REMOTE SENSING, 2023, 44 (20) : 6329 - 6349
[2] Generating High-Quality F0 Embeddings Using the Vector-Quantized Variational Autoencoder
Portes, David
Horak, Ales
TEXT, SPEECH, AND DIALOGUE, TSD 2024, PT II, 2024, 15049 : 139 - 148
[3] Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation
Zhao, Zhe
Qi, Mengshi
Ma, Huadong
COMPUTER VISION - ECCV 2024, PT XXIX, 2025, 15087 : 447 - 463
[4] Leveraging Vector-Quantized Variational Autoencoder Inner Metrics for Anomaly Detection
Gangloff, Hugo
Pham, Minh-Tan
Courtrai, Luc
Lefevre, Sebastien
2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 435 - 441
[5] Vector-quantized Variational Autoencoder for Phase-aware Speech Enhancement
Tuan Vu Ho
Quoc Huy Nguyen
Akagi, Masato
Unoki, Masashi
INTERSPEECH 2022, 2022, : 176 - 180
[6] Vector-Quantized Autoencoder With Copula for Collaborative Filtering
Wang, Guanyu
Zhong, Ting
Xu, Xovee
Zhang, Kunpeng
Zhou, Fan
Wang, Yong
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 3458 - 3462
[7] Quaternion Vector Quantized Variational Autoencoder
Luo, Hui
Liu, Xin
Sun, Jian
Zhang, Yang
IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 151 - 155
[8] CRANK: AN OPEN-SOURCE SOFTWARE FOR NONPARALLEL VOICE CONVERSION BASED ON VECTOR-QUANTIZED VARIATIONAL AUTOENCODER
Kobayashi, Kazuhiro
Huang, Wen-Chin
Wu, Yi-Chiao
Tobing, Patrick Lumban
Hayashi, Tomoki
Toda, Tomoki
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5934 - 5938
[9] Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis
Gu, Yuchao
Wang, Xintao
Ge, Yixiao
Shan, Ying
Shou, Mike Zheng
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024, : 7631 - 7640
[10] Bone-conducted Speech Enhancement Using Vector-quantized Variational Autoencoder and Gammachirp Filterbank Cepstral Coefficients
Quoc-Huy Nguyen
Unoki, Masashi
2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 21 - 25
