Spatial-Spectral Aggregation Transformer With Diffusion Prior for Hyperspectral Image Super-Resolution

Cited by: 4
Authors
Zhang, Mingyang [1]
Wang, Xiangyu [1]
Wu, Shuang [1]
Wang, Zhaoyang [1]
Gong, Maoguo [2, 3]
Zhou, Yu [1]
Jiang, Fenlong [4]
Wu, Yue [4]
Affiliations
[1] Xidian Univ, Sch Elect Engn, Key Lab Collaborat Intelligence Syst, Minist Educ, Xian 710071, Peoples R China
[2] Xidian Univ, Minist Educ, Key Lab Collaborat Intelligence Syst, Xian 710071, Peoples R China
[3] Inner Mongolia Normal Univ, Coll Math Sci, Hohhot 010028, Peoples R China
[4] Xidian Univ, Sch Comp Sci & Technol, Key Lab Collaborat Intelligence Syst, Minist Educ, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Hyperspectral image super-resolution; prior features; attention mechanism; transformer; diffusion model
DOI
10.1109/TCSVT.2024.3508844
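Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract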
Constrained by imaging systems, hyperspectral images (HSIs) typically have low spatial resolution. Deep learning-based HSI super-resolution methods have achieved impressive results by learning the nonlinear mapping between low-resolution (LR) and high-resolution (HR) images. However, most of them take as input the LR image or a version of it upsampled by bicubic interpolation, which leads to low-quality features and limits the detail the network can capture. As a powerful generative model, a diffusion model can learn both contextual semantics and textural details from distinct timesteps, enabling effective exploration of spatial-spectral distributions in high-dimensional data. In this paper, we propose a novel method that extracts high-quality prior information from the original images by pretraining a diffusion model and uses it to assist super-resolution. Specifically, we first train a diffusion model on original HSI patches in a self-supervised manner and then obtain prior features from the decoder of the pretrained denoising U-Net. To incorporate the prior features into the super-resolution model efficiently, we propose an adaptive fusion module based on spatial and spectral attention mechanisms, which enhances features in both dimensions while preserving their original characteristics. Additionally, to leverage the complementarity of spatial and spectral information, we design a spatial-spectral aggregation Transformer module that incorporates an adaptive interaction module to facilitate information exchange across the two dimensions, thereby enhancing representation capability. Extensive experiments on three public hyperspectral datasets demonstrate that the proposed method achieves excellent super-resolution performance and outperforms state-of-the-art methods in both quantitative metrics and visual quality.
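The abstract's remark about learning from "distinct timesteps" can be illustrated with the forward noising step of a standard DDPM-style diffusion process. The following is a minimal sketch, not the authors' implementation: the linear noise schedule, the 1000-step horizon, the function names, and the toy 4-band pixel spectrum are all assumptions introduced here for illustration. It shows that a small timestep leaves the signal nearly intact (fine detail survives), while a large timestep leaves almost pure noise (only coarse structure could be recovered).

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default; assumed here)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta) up to and including step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(spectrum, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * v + noise_scale * rng.gauss(0.0, 1.0) for v in spectrum]

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

# Toy 4-band pixel spectrum with hypothetical reflectance values.
pixel_spectrum = [0.2, 0.4, 0.6, 0.8]
slightly_noisy = add_noise(pixel_spectrum, 10, alpha_bars, rng)   # early timestep
mostly_noise = add_noise(pixel_spectrum, 900, alpha_bars, rng)    # late timestep
```

Under these assumptions, the fraction of the original signal retained at timestep `t` is `sqrt(alpha_bars[t])`, so features taken from a denoiser at different timesteps see inputs with very different signal-to-noise ratios.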
Pages: 3557-3572
Page count: 16