Spatial-spectral Collaborative Unrolling Network for Pansharpening

Cited by: 0
Authors
Zheng, Jianwei [1 ]
Xia, Hongyi [1 ]
Xu, Honghui [1 ]
Affiliations
[1] Zhejiang Univ Technol, Sch Comp Sci & Technol, Hangzhou 310023, Peoples R China
Keywords
Remote sensing image; Pansharpening; Deep learning; Unrolling network; Transformer; Multi-scale convolution; Feature interaction;
DOI
10.3788/gzxb20255401.0110003
Chinese Library Classification (CLC)
O43 [Optics];
Subject Classification Code
070207 ; 0803 ;
Abstract
To address the limitations inherent in physical device acquisition, pansharpening offers a computational alternative. It aims to enhance the spatial resolution of Low-Resolution Multispectral (LRMS) images by integrating textural information from Panchromatic (PAN) images, thereby generating High-Resolution Multispectral (HRMS) images. Recently, a growing number of deep-learning-based methods, leveraging their strong feature extraction capabilities, have been introduced and have demonstrated exceptional results in improving fusion quality. However, many of these methods still exhibit two notable shortcomings. First, the universally adopted black-box design limits model interpretability. Second, existing deep-learning-based methods fail to efficiently capture local and global dependencies at the same time, inevitably limiting overall performance. By combining the merits of nonlinear network architectures and interpretable optimization schemes, Deep Unfolding Networks (DUNs) have shed new light on pansharpening. However, current DUNs lack a dedicated design both for estimating the degradation matrices and for extracting intricate information from the proximal operator. To address these issues, we propose a novel Spatial-Spectral Collaborative Unrolling Network (SCUN). An alternating-optimization Half-Quadratic Splitting (HQS) scheme is applied to solve the resulting model, giving rise to an elementary iteration mechanism (a minimal illustrative sketch of one iteration is given after the abstract). Under the guidance of iterative optimization theory, the network achieves Adaptive Degradation Matrix Estimation (ADME) and spatial-spectral prior operator learning through multi-scale cascade strategies, pointwise convolution, and Transformer technology. During the ADME step, the estimation is carried out by an end-to-end iterative block, allowing adaptive modeling of complex spatial and spectral structures. On that basis, customized multiscale convolution and pointwise convolution are employed to simulate the spatial and spectral degradation matrices, respectively. Moreover, the proposed convolutions are re-estimated in each unfolding iteration, endowing the model with a highly adaptive capability. To address the limitations of prior operators, we propose a collaborative complementary mechanism that approximates the proximal operator and facilitates the joint exploration of global-local and spatial-spectral features, achieved through a combination of convolutional layers and attention mechanisms. The entire prior module is designed as a U-shaped network following an embedding, encoder, bottleneck, decoder, and de-embedding pipeline to extract refined feature representations. Initially, the intermediate variables are processed by an embedding layer, which segments them into non-overlapping patch tokens. These tokens are then fed into two Spatial-Spectral Collaborative Modules (SSCMs) and a bottleneck layer consisting of a single SSCM to explore comprehensive properties. Each SSCM comprises three key components: Spatial-Spectral Collaborative Attention (SSCA), Scale-Aware Channel Collaboration (SACC), and a Mixed-Scale Feed-forward Layer (MSFL). Specifically, SSCA includes two Transformer blocks. The first is the Spatial Transformer Block, which primarily transfers high-frequency texture features from the PAN image to the HRMS image. The second is the Spectral Transformer Block, which focuses on transferring spectral features from the LRMS image to the HRMS image.
After extracting these two attention features, a multi-head self-attention mechanism is further applied to deeply fuse the spatial and spectral information, thereby achieving enhanced collaboration and complementarity of the target information. Within SACC, features originating from receptive fields of different sizes are dynamically aggregated and cross-fused via multiscale convolution, while channel attention is simultaneously introduced to model the spectral dependency of the multispectral images (a simplified sketch of this idea also follows the abstract). Similarly, to strengthen the nonlinear feature transformation after the attention layers, the MSFL incorporates a mixed-scale strategy, and a cross-complementary mechanism is subsequently introduced to emphasize the important components of the multiscale convolutions. With all modules assembled, the proposed method represents the first attempt to systematically capture local-global and spatial-spectral information during model unfolding, guaranteeing appealing pansharpening performance. Experimental results on multiple remote sensing datasets demonstrate that the proposed method outperforms the comparison methods, achieving a PSNR gain of 0.798 dB on the GF-2 dataset.
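The HQS-based unrolling described in the abstract alternates between a data-consistency step, driven by learned spatial and spectral degradation operators, and a prior (proximal) step handled by the U-shaped spatial-spectral network. The PyTorch fragment below is a minimal sketch of one such iteration under these assumptions; the names (SpatialDegradation, SpectralDegradation, hqs_data_step), layer choices, and step size are hypothetical illustrations and do not reproduce the authors' SCUN implementation.

```python
# Minimal sketch of one HQS unrolling iteration for pansharpening.
# All module names and hyperparameters are hypothetical, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialDegradation(nn.Module):
    """Learned spatial degradation (blur + downsample), approximated here by
    multiscale depthwise convolutions followed by average pooling."""
    def __init__(self, bands, scale=4):
        super().__init__()
        self.scale = scale
        self.blur3 = nn.Conv2d(bands, bands, 3, padding=1, groups=bands)
        self.blur5 = nn.Conv2d(bands, bands, 5, padding=2, groups=bands)

    def forward(self, x):                          # x: (B, C, H, W) HRMS estimate
        x = 0.5 * (self.blur3(x) + self.blur5(x))
        return F.avg_pool2d(x, self.scale)         # (B, C, H/s, W/s)

class SpectralDegradation(nn.Module):
    """Learned spectral degradation (band mixing), approximated by a pointwise conv."""
    def __init__(self, bands):
        super().__init__()
        self.mix = nn.Conv2d(bands, 1, 1)          # MS bands -> single PAN channel

    def forward(self, x):
        return self.mix(x)

def hqs_data_step(x, lrms, pan, D, S, step=0.1):
    """Gradient step on the data terms ||D(x) - LRMS||^2 + ||S(x) - PAN||^2."""
    r_spatial = F.interpolate(D(x) - lrms, size=x.shape[-2:],
                              mode="bilinear", align_corners=False)
    r_spectral = S(x) - pan
    grad = r_spatial + r_spectral.expand_as(x)     # crude adjoint surrogate
    return x - step * grad

# One iteration: data-consistency step, then a placeholder prior (proximal) step.
prior = nn.Sequential(nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(32, 4, 3, padding=1))
B, C, s = 1, 4, 4
lrms = torch.rand(B, C, 32, 32)
pan = torch.rand(B, 1, 128, 128)
x = F.interpolate(lrms, scale_factor=s, mode="bilinear", align_corners=False)
D, S = SpatialDegradation(C, s), SpectralDegradation(C)
x = hqs_data_step(x, lrms, pan, D, S)              # data step
x = x + prior(x)                                   # prior step, residual form
```

In a full unrolling network, this pair of steps would be repeated for a fixed number of stages, with the degradation modules re-estimated at each stage as the abstract indicates, and the placeholder prior replaced by the U-shaped SSCM network.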
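The SACC component is described as combining multiscale convolution over receptive fields of different sizes with channel attention that models spectral dependency. The block below is a simplified, assumption-laden sketch of that idea (an SE-style channel gate over fused multiscale depthwise branches); the class name ScaleAwareChannelBlock and all layer choices are illustrative, not the paper's actual module.

```python
# Illustrative multiscale-convolution + channel-attention block (SACC-style idea).
import torch
import torch.nn as nn

class ScaleAwareChannelBlock(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        # receptive fields of different sizes (multiscale depthwise convolutions)
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.branch5 = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)   # cross-fuse the branches
        # channel attention to model spectral (band-wise) dependency
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())

    def forward(self, x):
        feats = torch.cat([self.branch3(x), self.branch5(x)], dim=1)
        fused = self.fuse(feats)
        return x + fused * self.attn(fused)                # channel-weighted residual

x = torch.rand(1, 32, 64, 64)
print(ScaleAwareChannelBlock(32)(x).shape)                 # torch.Size([1, 32, 64, 64])
```

Depthwise branches keep the per-band structure cheap to compute, while the 1x1 fusion and the channel gate provide the cross-scale and cross-band interaction the abstract attributes to SACC.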
Pages: 13