HCT: image super-resolution restoration using hierarchical convolution transformer networks

Cited by: 0
Authors
Guo, Ying [1, 2]
Tian, Chang [1]
Wang, Han [1]
Liu, Jie [1]
Di, Chong [3]
Ning, Keqing [1]
Affiliations
[1] North China Univ Technol, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Beijing 100084, Peoples R China
[3] Qilu Univ Technol, Shandong Artificial Intelligence Inst, Shandong Acad Sci, Jinan 250353, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Hierarchical convolution network; Swin transformer; Image super-resolution
DOI
10.1007/s10044-025-01413-0
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In computer vision, image super-resolution (SR), which restores high-resolution details from low-resolution images, plays a vital role in practical applications such as medical imaging, public safety, and remote sensing. Traditional methods address this task with convolutional neural networks (CNNs), while Vision Transformers have shown promising performance in high-level vision tasks. However, compared with typical CNN architectures, Vision Transformers rely less on high-frequency image information, which leads to blurred details and residual artifacts. To address this issue, we use a hierarchical network structure, which gives our approach a more flexible receptive field. First, our method compensates for lost spatial features using a Convolutional Swin Transformer Layer that incorporates a Convolutional Feed Forward Network. This recovers missing spatial information and enhances the model's representational capability. Next, deep feature extraction is performed by combining multiple layers into a Residual Convolutional Swin Transformer Block. Finally, we employ a hierarchical structure to combine the features of each branch. Experiments show that the proposed method generates images with greater detail that align with human perception, is effective on SR tasks with magnification factors of 2, 3, and 4, and reconstructs clear and complete edge structures. Code is available at https://github.com/Q88392/HCT.
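The abstract does not spell out the internals of the Convolutional Feed Forward Network. As a rough illustration only, the sketch below assumes a commonly used design: a token-wise linear expansion, a 3x3 depthwise convolution over the token grid that mixes each token with its spatial neighbors, and a linear projection back, with residual connections. All names here (`conv_ffn`, `depthwise_conv3x3`, the weight layouts) are hypothetical and are not taken from the HCT code; the authors' layer may differ.

```python
import math


def gelu(x):
    # Exact GELU activation using the error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(tokens, weight):
    # tokens: N vectors of length C_in; weight: C_in x C_out (no bias).
    c_out = len(weight[0])
    return [
        [sum(t[i] * weight[i][j] for i in range(len(t))) for j in range(c_out)]
        for t in tokens
    ]


def depthwise_conv3x3(grid, kernels):
    # grid: H x W x C feature map; kernels: C x 3 x 3, one kernel per channel.
    # Zero padding keeps the spatial size unchanged.
    h, w, c = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ch in range(c):
                acc = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w:
                            acc += grid[yy][xx][ch] * kernels[ch][dy + 1][dx + 1]
                out[y][x][ch] = acc
    return out


def conv_ffn(tokens, h, w, w_in, kernels, w_out):
    # Assumed layout: tokens are the h*w positions in row-major order.
    # 1) Expand channels token by token, then apply GELU.
    hidden = [[gelu(v) for v in row] for row in linear(tokens, w_in)]
    # 2) Reshape to a grid and mix neighbors with a depthwise 3x3 convolution.
    grid = [hidden[y * w:(y + 1) * w] for y in range(h)]
    conv = depthwise_conv3x3(grid, kernels)
    # 3) Local residual around the convolution branch, flattened back to tokens.
    mixed = [
        [a + gelu(b) for a, b in zip(grid[y][x], conv[y][x])]
        for y in range(h)
        for x in range(w)
    ]
    # 4) Project back to the input width and add the outer residual.
    projected = linear(mixed, w_out)
    return [[t + p for t, p in zip(tok, row)] for tok, row in zip(tokens, projected)]


if __name__ == "__main__":
    # Toy 3x3 grid of 2-channel tokens with a 4-channel hidden width.
    toy_tokens = [[1.0, 1.0] for _ in range(9)]
    toy_w_in = [[0.1] * 4 for _ in range(2)]
    toy_w_out = [[0.1] * 2 for _ in range(4)]
    toy_kernels = [[[0.1] * 3 for _ in range(3)] for _ in range(4)]
    result = conv_ffn(toy_tokens, 3, 3, toy_w_in, toy_kernels, toy_w_out)
    print(len(result), len(result[0]))
```

In this sketch, the depthwise convolution is the only step in which a token sees its neighbors; a plain feed-forward network processes each token independently. That neighbor mixing is one plausible reading of how the layer "recovers missing spatial information".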
Pages: 11