End-to-End Low Cost Compressive Spectral Imaging with Spatial-Spectral Self-Attention

Cited by: 181
Authors
Meng, Ziyi [1, 2]
Ma, Jiawei [3]
Yuan, Xin [4]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
[2] New Jersey Inst Technol, Newark, NJ 07102 USA
[3] Columbia Univ, New York, NY 10027 USA
[4] Nokia Bell Labs, Murray Hill, NJ 07974 USA
Source
COMPUTER VISION - ECCV 2020, PT XXIII | 2020 / Vol. 12368
Keywords
Compressive spectral imaging; Spatial-Spectral Self-Attention; Large-scale real data; RECONSTRUCTION; VIDEO; DESIGN; NOISE; MODEL;
DOI
10.1007/978-3-030-58592-1_12
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Coded aperture snapshot spectral imaging (CASSI) is an effective tool for capturing real-world 3D hyperspectral images. While a number of existing works address hardware and algorithm design, we take a step towards a low-cost solution that delivers video-rate, high-quality reconstruction. To make solid progress on this challenging yet under-investigated task, we reproduce a stable single disperser (SD) CASSI system to gather large-scale real-world CASSI data and propose a novel deep convolutional network that performs real-time reconstruction using self-attention. To jointly capture self-attention across the different dimensions of hyperspectral images (i.e., channel-wise spectral correlation and non-local spatial regions), we propose Spatial-Spectral Self-Attention (TSA), which processes each dimension sequentially, yet in an order-independent manner. We employ TSA in an encoder-decoder network, dubbed TSA-Net, to reconstruct the desired 3D cube. Furthermore, we investigate how noise affects the results and propose adding shot noise during model training, which significantly improves results on real data. We hope our large-scale CASSI data will serve as a benchmark for future research and our TSA model as a baseline for deep-learning-based reconstruction algorithms. Our code and data are available at https://github.com/mengziyi64/TSA-Net.
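The abstract mentions two steps that a short sketch may make more concrete: forming a single-disperser CASSI snapshot from a 3D cube, and adding shot noise to that snapshot for training. The following is only an illustrative sketch under stated assumptions, not the authors' implementation. The function names, the per-channel dispersion `step`, the Poisson noise model, and the `peak_photons` scale are all hypothetical choices made for illustration; the record above does not specify them.

```python
import numpy as np


def sd_cassi_measurement(cube, mask, step=1):
    """Simulate a single-disperser CASSI snapshot (illustrative model).

    cube: (H, W, C) hyperspectral cube with values in [0, 1]
    mask: (H, W) coded aperture
    step: assumed dispersion shift, in pixels, between adjacent channels
    """
    h, w, c = cube.shape
    meas = np.zeros((h, w + step * (c - 1)), dtype=np.float64)
    for k in range(c):
        # Modulate channel k with the mask, shift it by k * step columns,
        # and accumulate it on the 2D detector.
        meas[:, k * step:k * step + w] += cube[:, :, k] * mask
    return meas


def add_shot_noise(meas, peak_photons=1000.0, rng=None):
    """Add signal-dependent noise using a Poisson model (an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    # Draw photon counts whose mean is proportional to the clean signal,
    # then rescale back to the original intensity range.
    counts = rng.poisson(np.clip(meas, 0.0, None) * peak_photons)
    return counts / peak_photons


# Usage on a small random cube (synthetic data, for illustration only).
rng = np.random.default_rng(0)
cube = rng.random((8, 8, 4))
mask = (rng.random((8, 8)) > 0.5).astype(np.float64)
clean = sd_cassi_measurement(cube, mask, step=2)
noisy = add_shot_noise(clean, peak_photons=1000.0, rng=rng)
```

In this sketch a larger `peak_photons` corresponds to a brighter scene and therefore relatively weaker noise; a training pipeline could apply `add_shot_noise` to each simulated measurement before feeding it to the reconstruction network.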
Pages: 187-204
Number of pages: 18