SpikeODE: Image Reconstruction for Spike Camera With Neural Ordinary Differential Equation

Cited by: 1
Authors
Yang, Chen [1]
Li, Guorong [1]
Wang, Shuhui [2]
Su, Li [1]
Qing, Laiyun [1]
Huang, Qingming [1]
Affiliations
[1] Univ Chinese Acad Sci, Key Lab Big Data Min & Knowledge Management, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spike camera; image reconstruction; neural ordinary differential equation; temporal-spatial correlation;
DOI
10.1109/TCSVT.2024.3417812
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
The recently invented retina-inspired spike camera has shown great potential for capturing dynamic scenes. However, reconstructing high-quality images from the binary spike data remains challenging due to the noise inherent in the camera. This paper proposes SpikeODE, a novel approach that reconstructs clear images by exploiting temporal-spatial correlation to suppress noise. The main idea of our method is to restore the continuous dynamic process of real scenes in a latent space and to learn the temporal correlations in a fine-grained manner. Furthermore, to model the dynamic process more effectively, we design a conditional ODE in which the latent state at each timestamp is conditioned on the observed spike data. Forward and backward inferences are then conducted through the ODE to investigate the correlations between the representation of the target timestamp and information from both past and future contexts. Additionally, we incorporate a U-Net structure with a pixel-wise attention mechanism at each level to learn spatial correlations. Experimental results demonstrate that our method outperforms state-of-the-art methods across several metrics.
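For illustration only (not the authors' code): a minimal PyTorch sketch of a conditional latent ODE whose derivative depends on spike-derived features, integrated once forward and once backward with a fixed-step Euler solver to mimic the forward and backward inference described in the abstract. All module names, dimensions, and the choice of solver are assumptions made for this sketch.

```python
# Minimal sketch of a conditional latent ODE (assumptions throughout; not SpikeODE itself).
import torch
import torch.nn as nn


class ConditionalODEFunc(nn.Module):
    """Derivative network f(z, c): latent dynamics conditioned on spike features."""

    def __init__(self, latent_dim: int, cond_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, 128),
            nn.Tanh(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, z: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # dz/dt depends on both the current latent state and the observed spike condition.
        return self.net(torch.cat([z, cond], dim=-1))


def integrate(func: ConditionalODEFunc, z0: torch.Tensor, conds: torch.Tensor,
              dt: float = 0.1, reverse: bool = False) -> torch.Tensor:
    """Explicit-Euler integration over a sequence of spike conditions.

    Calling this once with reverse=False (past -> target) and once with
    reverse=True (future -> target) is a simplified stand-in for the paper's
    forward and backward inference.
    """
    steps = list(range(conds.shape[0]))
    if reverse:
        steps = steps[::-1]
        dt = -dt
    z = z0
    states = []
    for t in steps:
        z = z + dt * func(z, conds[t])  # Euler update of the latent state
        states.append(z)
    return torch.stack(states)


if __name__ == "__main__":
    latent_dim, cond_dim, T, batch = 32, 16, 8, 4
    func = ConditionalODEFunc(latent_dim, cond_dim)
    z0 = torch.zeros(batch, latent_dim)
    spike_feats = torch.randn(T, batch, cond_dim)  # placeholder spike-derived features
    fwd = integrate(func, z0, spike_feats)                 # past -> target timestamp
    bwd = integrate(func, z0, spike_feats, reverse=True)   # future -> target timestamp
    print(fwd.shape, bwd.shape)  # both: torch.Size([8, 4, 32])
```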
Pages: 11142-11155
Page count: 14