Simultaneous Learning Intensity and Optical Flow From High-Speed Spike Stream

Cited by: 0
Authors
Zhu, Lin [1]
Yan, Weiquan [2]
Chang, Yi [3]
Tian, Yonghong [4]
Huang, Hua [5]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518055, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[4] Peking Univ, Sch Comp Sci, Beijing 100081, Peoples R China
[5] Beijing Normal Univ, Sch Artificial Intelligence, Beijing 100875, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image reconstruction; Optical flow; Streaming media; Cameras; Feature extraction; Estimation; Vision sensors; Neurons; Image motion analysis; Decoding; Optical flow estimation; Bio-inspired camera; Spiking neural network; Event cameras; Reconstruction; Networks
DOI
10.1109/TCSVT.2024.3516478
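Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification code
0808; 0809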
Abstract
Bio-inspired vision sensors, which emulate the human retina by recording light intensity as binary spikes, have gained increasing interest in recent years. Among them, the spike camera is capable of perceiving fine textures by simulating a small retinal region called the fovea and producing high temporal resolution (20,000 Hz) spatiotemporal spike streams. To bridge the gap between binary spike streams and human vision in high-speed scenes, reconstructing intensity and optical flow from high temporal resolution spikes is particularly important. In this paper, we present a hybrid SNN-ANN network designed for simultaneous intensity and optical flow learning from spike streams. To adaptively extract spatial and temporal features from continuous spike streams, we propose a spiking neuron module with dense connections that efficiently processes both short-term and long-term spike data, while maintaining low power consumption characteristics. Subsequently, we introduce two decoders for optical flow and intensity estimation that complement each other. A temporal-aware warping module, based on flow features, is specifically designed to align the temporal features of the intensity decoder, thereby reducing motion artifacts. Concurrently, improved intensity features contribute to more accurate flow feature predictions, resulting in a mutually beneficial relationship within our network. To evaluate the effectiveness of our proposed network, we conduct experiments on both simulated and real spike datasets. Our network outperforms existing state-of-the-art spike-based reconstruction and optical flow estimation methods, demonstrating its potential for advancing the field of bio-inspired vision sensors. Our code is available at https://github.com/LinZhu111/SLIO.
Pages: 5126-5139
Page count: 14