Rethinking Image Deblurring via CNN-Transformer Multiscale Hybrid Architecture

Authors
Zhao, Qian [1]
Yang, Hao [1]
Zhou, Dongming [1]
Cao, Jinde [2, 3]
Affiliations
[1] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650091, Peoples R China
[2] Southeast Univ, Sch Math, Nanjing 210096, Peoples R China
[3] Yonsei Univ, Yonsei Frontier Lab, Seoul 03722, South Korea
Funding
National Natural Science Foundation of China
Keywords
Image deblurring; motion blur; multiscale strategy; neural networks; vision transformer (ViT)
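DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract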
Image deblurring is a representative low-level vision task that aims to estimate latent sharp images from blurred images. Recently, convolutional neural network (CNN)-based methods have dominated image deblurring. However, traditional CNN-based deblurring methods suffer from two essential issues. First, existing multiscale deblurring methods process blurred images at different scales through sub-networks with the same composition, which limits model performance. Second, convolutional layers do not adapt to the input content and cannot effectively capture long-range dependencies. To alleviate these issues, we rethink the multiscale architecture that follows a coarse-to-fine strategy and propose a novel hybrid architecture that combines a CNN and a transformer (CTMS). CTMS has three distinct features. First, the finer-scale sub-networks in CTMS are designed with larger receptive fields to gather the pixel values around the blur, which helps handle large-area blur efficiently. Second, we propose a feature modulation network to compensate for the CNN sub-networks' lack of input content adaptation. Third, we design an efficient transformer block that significantly reduces the computational burden and requires no pre-training. The proposed deblurring model is extensively evaluated on several benchmark datasets and achieves superior performance compared to state-of-the-art deblurring methods. In particular, it reaches a peak signal-to-noise ratio (PSNR) of 32.73 dB and a structural similarity (SSIM) of 0.959 on the popular GoPro dataset. In addition, we conduct joint evaluation experiments on the proposed method's deblurring performance, object detection, and image segmentation to demonstrate the effectiveness of CTMS for subsequent high-level computer vision tasks.
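For readers unfamiliar with the PSNR figure quoted in the abstract, the following is a minimal sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE), in plain Python. It is not taken from the paper: the function name `psnr`, the flat-list image representation, and the default 8-bit peak value of 255 are illustrative assumptions, and the authors' exact evaluation protocol is not described in this record.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Return PSNR in dB between two images given as flat pixel sequences.

    Hypothetical helper for illustration; max_value=255 assumes 8-bit pixels.
    """
    if len(reference) != len(restored):
        raise ValueError("images must contain the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixel positions.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy pixel values, made up for this example only.
sharp = [100, 100, 100, 100]
deblurred = [110, 90, 110, 90]
score = psnr(sharp, deblurred)
```

A higher PSNR means the restored image is closer to the reference; each 10 dB increase corresponds to a tenfold reduction in mean squared error.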
Pages: 15