DMDC: a cross-attention network for dynamic mask-based dual-camera snapshot hyperspectral Photography

Cited by: 1

Authors:

Cai, Zeyu [1,4]

Zhang, Ziyu [2]

Jin, Chengqian [3]

Da, Feipeng [1,4]

Affiliations:

[1] Southeast Univ, Sch Automat, Nanjing 210000, Peoples R China

[2] Nanjing Univ, Med Sch, Nanjing 210000, Peoples R China

[3] Chinese Acad Agr Sci, NRIAM, Nanjing 210000, Peoples R China

[4] Minist Educ, Key Lab Measurement & Control CSE, Nanjing 210000, Peoples R China

Source:

VISUAL COMPUTER | 2025 / Vol. 41 / No. 07

Funding:

National Natural Science Foundation of China;

Keywords:

Hyperspectral-imaging; RGB; CASSI; Dynamic; Cross-attention; ACQUISITION; DESIGN; VIDEO;

DOI:

10.1007/s00371-024-03700-z

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Subject classification code:

081202 ; 0835 ;

Abstract:

Spectral images enrich the material information of a reconstructed scene and play an essential role in computer visualization. Coded aperture snapshot spectral imaging (CASSI) for dynamic scenes still faces two reconstruction problems: (1) a single input limits the network's performance, and (2) fixed mask coding keeps the spatial light modulator (SLM) from reaching its full potential. This paper proposes a cross-attention-based dual-stream network for a dual-camera CASSI system. We argue that RGB images and CASSI measurements are projections of the 3D spectral cube onto different 2D spaces, and that fusing the spatial features of RGB with the spectral features of CASSI improves reconstruction quality. Building on this, we place a dynamic mask module in front of the cross-attention-based dual-stream network to further improve the reconstruction quality of the system. Specifically, the dynamic mask module uses RGB images to pre-learn the spatial feature distribution of the scene and then guides the SLM in encoding the CASSI measurement. Finally, the cross-attention-based dual-stream network reconstructs the hyperspectral image from the RGB image and the CASSI measurement. Comprehensive experiments on various datasets demonstrate the superior performance of our method. At similar speeds, our method provides a 4.0 dB improvement over existing state-of-the-art (SOTA) methods on clean and noisy datasets. In the snapshot video imaging task, the single-snapshot imaging time of DMDC-1stg is less than 50 ms, which verifies the feasibility of our method. (The code has been released at https://github.com/caizeyu1992/DMDC.)
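The abstract's two central ideas, that the RGB image and the CASSI measurement are different 2D projections of the same spectral cube, and that the two streams are fused by cross-attention, can be illustrated with a small NumPy sketch. This is not the released DMDC code. The function names, the single-disperser shift model with a fixed `step`, the `(L, 3)` spectral response matrix, and the single-head attention without learned projections are all assumptions made for illustration.

```python
import numpy as np


def cassi_measurement(cube, mask, step=2):
    """Assumed single-disperser CASSI model: code each band with the mask,
    shift it along the width axis by k * step pixels, and sum on the sensor."""
    H, W, L = cube.shape
    meas = np.zeros((H, W + (L - 1) * step))
    for k in range(L):
        meas[:, k * step:k * step + W] += mask * cube[:, :, k]
    return meas


def rgb_projection(cube, response):
    """Project the cube onto 3 channels using an assumed (L, 3) response matrix."""
    return cube @ response  # (H, W, L) @ (L, 3) -> (H, W, 3)


def cross_attention(query_feats, context_feats):
    """Single-head scaled dot-product cross-attention (no learned weights).
    Queries come from one stream (N, d); keys and values from the other (M, d)."""
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ context_feats, weights


# Toy usage with random data (hypothetical sizes).
rng = np.random.default_rng(0)
cube = rng.random((8, 8, 4))                       # H x W x L spectral cube
mask = (rng.random((8, 8)) > 0.5).astype(float)    # binary coded aperture
meas = cassi_measurement(cube, mask, step=2)       # shape (8, 14)
rgb = rgb_projection(cube, rng.random((4, 3)))     # shape (8, 8, 3)

# Treat flattened pixels as tokens; here the raw values stand in for features.
rgb_tokens = rgb.reshape(-1, 3)
fused, attn = cross_attention(rgb_tokens, rgb_tokens)
```

In a dual-stream network of the kind described, the queries would presumably come from learned features of one stream and the keys and values from the other; the toy call above reuses one token set only to keep the shapes compatible.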


Pages: 4957-4974

Page count: 18
