PLGNet: Prior-Guided Local and Global Interactive Hybrid Network for Face Super-Resolution

Cited by: 1

Authors:

Li, Ling [1]

Zhang, Yan [1]

Yuan, Lin [1]

Gao, Xinbo [1]

Affiliations:

[1] Chongqing Univ Posts & Telecommun, Coll Comp Sci & Technol, Chongqing Key Lab Image Cognit, Chongqing 400065, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Image reconstruction; Faces; Face recognition; Transformers; Superresolution; Semantics; Feature extraction; Face super-resolution (FSR); facial prior; attention aggregation; transformer; TRANSFORMER; ALIGNMENT;

DOI:

10.1109/TCSVT.2024.3403713

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Recent CNN-driven face super-resolution (FSR) technologies have achieved excellent breakthroughs by incorporating facial prior knowledge. However, most of them suffer from some obvious limitations. They always estimate facial priors from input low-resolution (LR) faces or coarsely enhanced LR faces, obtaining unfaithful priors that cannot be adequately exploited. This may bring noticeable artifacts to the target results, especially for large scaling factors, deteriorating the fidelity and naturalness and generating suboptimal reconstructed results. In this paper, we propose a two-stage prior-guided FSR approach to learn facial prior knowledge from the optimal SR results of stage one and explore the complementarity between priors to further guide more accurate reconstruction in stage two. Specifically, we develop an efficient local and global interactive hybrid network incorporating facial semantic and geometric priors for more discriminative results. To reach this, we devise a multiscale interconnected symmetric encoder-decoder architecture composed of Prior Interaction-Integration Modules (PIIMs), the Coarse-to-fine Feature Refinement Module (CFRM), and Feature Aggregation Modulation Modules (FAMMs). The encoder concentrates on hierarchically extracting multiscale features. The CFRM is devised to explore the potential correlations between the encoder and the decoder and further guide the refinement and reinforcement of the encoded features. The decoder aims to take full advantage of informative multiscale encoded features to reconstruct high-quality SR representations. Comprehensive evaluation and visualization results on four benchmark datasets demonstrate the superiority of the proposed PLGNet over current state-of-the-art methods. The source code of PLGNet will be available at https://github.com/lil808/PLGNet.git.

Pages: 10166-10181

Page count: 16