Selection and Cross Similarity for Event-Image Deep Stereo

Cited by: 16
Authors
Cho, Hoonhee [1 ]
Yoon, Kuk-Jin [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
Source
COMPUTER VISION - ECCV 2022, PT XXXII | 2022, Vol. 13692
Funding
National Research Foundation, Singapore;
Keywords
Event cameras; Stereo depth; Multi-modal fusion;
DOI
10.1007/978-3-031-19824-3_28
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Standard frame-based cameras suffer from low dynamic range and motion blur in real applications. Event cameras, by contrast, are bio-inspired sensors that asynchronously report the polarity of pixel-level log-intensity changes, producing a continuous data stream with high dynamic range even under fast motion. Event cameras are therefore effective for stereo depth estimation under challenging illumination and/or fast motion. To estimate the disparity map with events, existing state-of-the-art event-based stereo models use the image together with all past events accumulated up to the current image acquisition time. However, not all events contribute equally to the disparity estimation of the current frame, since past events occur at different times, under different motions, and with different disparity values. Events therefore need to be carefully selected for accurate event-guided disparity estimation. In this paper, we aim to effectively handle events that continuously occur with different disparity values depending on the camera's movement. To this end, we first propose a differentiable event selection network that selects the events most relevant to the current depth estimation. Furthermore, we exploit the feature-like events triggered around object boundaries, which serve as ideal guides for disparity estimation; to this end, we propose a neighbor cross similarity feature (NCSF) that measures the similarity between different modalities. Finally, our experiments on various datasets demonstrate the superiority of our method in estimating depth from images and event data together. Our project code is available at: https://github.com/Chohoonhee/SCSNet.
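The abstract names two concrete mechanisms: a differentiable event selection network and a neighbor cross similarity feature across modalities. Below is a minimal PyTorch sketch of how such components could look. The Gumbel-softmax relaxation, the function names (gumbel_softmax_select, neighbor_cross_similarity), and the parameters (tau, kernel) are illustrative assumptions based on common practice, not the authors' released implementation; see the linked repository for the actual code.

import torch
import torch.nn.functional as F

def gumbel_softmax_select(scores, tau=1.0):
    # Differentiable (soft) selection over N temporal event groups.
    # scores: (B, N) relevance logits; returns (B, N) weights that
    # approach one-hot as tau -> 0, so gradients reach the scorer.
    gumbel = -torch.log(-torch.log(torch.rand_like(scores) + 1e-9) + 1e-9)
    return F.softmax((scores + gumbel) / tau, dim=-1)

def neighbor_cross_similarity(event_feat, image_feat, kernel=3):
    # Cosine similarity between each image-feature location and the
    # kernel x kernel neighborhood of the event feature map.
    # event_feat, image_feat: (B, C, H, W); returns (B, kernel*kernel, H, W).
    B, C, H, W = event_feat.shape
    e = F.normalize(event_feat, dim=1)
    i = F.normalize(image_feat, dim=1)
    # Unfold event features into per-pixel neighborhoods: (B, C*k*k, H*W).
    neigh = F.unfold(e, kernel_size=kernel, padding=kernel // 2)
    neigh = neigh.view(B, C, kernel * kernel, H, W)
    # Dot each neighbor against the center image feature; this is cosine
    # similarity since both maps are L2-normalized along the channel axis.
    return (neigh * i.unsqueeze(2)).sum(dim=1)

In this sketch, the selection weights stay differentiable so the relevance scorer can be trained end-to-end, and the resulting cross-similarity volume could be concatenated with image features as boundary-aware guidance for a stereo matching backbone.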
Pages: 470-486
Page count: 17