SpikeODE: Image Reconstruction for Spike Camera With Neural Ordinary Differential Equation

Cited by: 1
Authors
Yang, Chen [1 ]
Li, Guorong [1 ]
Wang, Shuhui [2 ]
Su, Li [1 ]
Qing, Laiyun [1 ]
Huang, Qingming [1 ]
Affiliations
[1] Univ Chinese Acad Sci, Key Lab Big Data Min & Knowledge Management, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spike camera; image reconstruction; neural ordinary differential equation; temporal-spatial correlation;
DOI
10.1109/TCSVT.2024.3417812
CLC Number
TM [Electrical Engineering]; TN [Electronics & Communications Technology];
Discipline Codes
0808; 0809;
Abstract
The recently invented retina-inspired spike camera has shown great potential for capturing dynamic scenes. However, reconstructing high-quality images from the binary spike data remains challenging due to noise inherent in the camera. This paper proposes SpikeODE, a novel approach that reconstructs clear images by exploiting temporal-spatial correlations to suppress noise. The main idea of our method is to restore the continuous dynamic process of real scenes in a latent space and learn the temporal correlations in a fine-grained manner. Furthermore, to model the dynamic process more effectively, we design a conditional ODE in which the latent state at each timestamp is conditioned on the observed spike data. Forward and backward inferences are then conducted through the ODE to investigate the correlations between the representation of the target timestamp and information from both past and future contexts. Additionally, we incorporate a U-Net structure with a pixel-wise attention mechanism at each level to learn spatial correlations. Experimental results demonstrate that our method outperforms state-of-the-art methods across several metrics.
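The abstract's core mechanism (a latent ODE whose derivative is conditioned on the observed spike data, integrated both forward and backward so a target timestamp sees past and future context) can be sketched minimally as follows. This is an illustrative toy with Euler integration and random stand-in weights (`W`, `C`, the averaging fusion, and all sizes are assumptions, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

T, D = 8, 4  # toy number of timestamps and latent dimension
W = rng.normal(scale=0.1, size=(D, D))  # stand-in for learned latent dynamics
C = rng.normal(scale=0.1, size=(D, D))  # stand-in for learned spike conditioning

def f(z, s):
    """Conditional derivative dz/dt: depends on the latent state z
    and the observed spike feature s at the current timestamp."""
    return np.tanh(W @ z + C @ s)

def integrate(z0, spikes, dt=0.1, reverse=False):
    """Euler integration of the conditional ODE over the spike sequence.
    reverse=True runs the backward pass, starting from the last timestamp."""
    order = range(T - 1, -1, -1) if reverse else range(T)
    z, states = z0.copy(), {}
    for t in order:
        z = z + dt * f(z, spikes[t])
        states[t] = z
    return states

spikes = rng.binomial(1, 0.3, size=(T, D)).astype(float)  # toy binary spikes
fwd = integrate(np.zeros(D), spikes)                 # past  -> target
bwd = integrate(np.zeros(D), spikes, reverse=True)   # future -> target

# Fuse past and future context at a target timestamp (simple average here;
# the paper learns this fusion).
t_target = T // 2
h = 0.5 * (fwd[t_target] + bwd[t_target])
```

In the actual method the derivative and fusion are learned networks and a proper ODE solver replaces the fixed-step Euler loop; the point here is only the data flow: condition on spikes, integrate in both directions, combine at the target timestamp.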
Pages: 11142-11155
Page count: 14
Related Papers
44 in total
  • [1] Dong S., Huang T., Tian Y., Spike camera and its coding methods, Proc. Data Compress. Conf. (DCC), pp. 1-437, (2017)
  • [2] Dong S., Zhu L., Xu D., Tian Y., Huang T., An efficient coding method for spike camera using inter-spike intervals, Proc. Data Compress. Conf. (DCC), pp. 1-568, (2019)
  • [3] Lichtsteiner P., Posch C., Delbruck T., A 128×128 120 dB 15 μs latency asynchronous temporal contrast vision sensor, IEEE J. Solid-State Circuits, 43, 2, pp. 566-576, (2008)
  • [4] Serrano-Gotarredona T., Linares-Barranco B., A 128×128 1.5% contrast sensitivity 0.9% FPN 3 μs latency 4 mW asynchronous frame-free dynamic vision sensor using transimpedance preamplifiers, IEEE J. Solid-State Circuits, 48, 3, pp. 827-838, (2013)
  • [5] Nie K., Shi X., Cheng S., Gao Z., Xu J., High frame rate video reconstruction and deblurring based on dynamic and active pixel vision image sensor, IEEE Trans. Circuits Syst. Video Technol, 31, 8, pp. 2938-2952, (2021)
  • [6] Liu D., Wang T., Sun C., Voxel-based multi-scale transformer network for event stream processing, IEEE Trans. Circuits Syst. Video Technol, 34, 4, pp. 2112-2124, (2024)
  • [7] Ke J., Wang Q., Wang Y., Milanfar P., Yang F., MUSIQ: Multiscale image quality transformer, Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), pp. 5148-5157, (2021)
  • [8] Zhu L., Dong S., Huang T., Tian Y., A retina-inspired sampling method for visual texture reconstruction, Proc. IEEE Int. Conf. Multimedia Expo. (ICME), pp. 1432-1437, (2019)
  • [9] Zhao J., Xiong R., Liu H., Zhang J., Huang T., Spk2ImgNet: Learning to reconstruct dynamic scene from continuous spike stream, Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 11991-12000, (2021)
  • [10] Zhu L., Li J., Wang X., Huang T., Tian Y., NeuSpike-Net: High speed video reconstruction via bio-inspired neuromorphic cameras, Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), pp. 2380-2389, (2021)