End-to-End Learning of Representations for Asynchronous Event-Based Data

Cited by: 248
Authors
Gehrig, Daniel [1 ,2 ,3 ]
Loquercio, Antonio [1 ,2 ,3 ]
Derpanis, Konstantinos G. [4 ,5 ]
Scaramuzza, Davide [1 ,2 ,3 ]
Affiliations
[1] Univ Zurich, Dept Informat, Robot & Percept Grp, Zurich, Switzerland
[2] Univ Zurich, Dept Neuroinformat, Robot & Percept Grp, Zurich, Switzerland
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] Ryerson Univ, Toronto, ON, Canada
[5] Samsung AI Ctr Toronto, Toronto, ON, Canada
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Funding
Swiss National Science Foundation; Natural Sciences and Engineering Research Council of Canada;
Keywords
VISION; FEATURES;
DOI
10.1109/ICCV.2019.00573
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Event cameras are vision sensors that record asynchronous streams of per-pixel brightness changes, referred to as "events". They have appealing advantages over frame-based cameras for computer vision, including high temporal resolution, high dynamic range, and no motion blur. Due to the sparse, non-uniform spatiotemporal layout of the event signal, pattern recognition algorithms typically aggregate events into a grid-based representation and subsequently process it with a standard vision pipeline, e.g., a Convolutional Neural Network (CNN). In this work, we introduce a general framework to convert event streams into grid-based representations through a sequence of differentiable operations. Our framework comes with two main advantages: (i) it allows learning the input event representation together with the task-dedicated network in an end-to-end manner, and (ii) it lays out a taxonomy that unifies the majority of extant event representations in the literature and identifies novel ones. Empirically, we show that learning the event representation end-to-end yields an improvement of approximately 12% on optical flow estimation and object recognition over state-of-the-art methods.
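To illustrate the core idea of the abstract (a grid-based event representation built from differentiable operations, so the representation itself can be trained jointly with a downstream CNN), the following is a minimal PyTorch sketch. It assumes events arrive as (x, y, t, p) tuples; the class name LearnableEventGrid, the choice of a small MLP as the temporal kernel, and the number of bins are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn as nn

class LearnableEventGrid(nn.Module):
    """Illustrative sketch: convert an asynchronous event stream (x, y, t, p)
    into a dense num_bins x H x W grid by scatter-adding a learnable temporal
    kernel response per event. Gradients flow from the downstream task loss
    back into the kernel parameters, i.e., the representation is learned
    end-to-end. Not the authors' exact model."""

    def __init__(self, height, width, num_bins=9):
        super().__init__()
        self.height, self.width, self.num_bins = height, width, num_bins
        # Small MLP acting as a learnable kernel over the normalized time
        # offset between an event and each temporal bin centre (assumption).
        self.kernel = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, events):
        # events: (N, 4) float tensor with columns (x, y, t, p), p in {-1, +1}
        x, y = events[:, 0].long(), events[:, 1].long()
        t, p = events[:, 2], events[:, 3]
        t = (t - t.min()) / (t.max() - t.min() + 1e-9)  # normalize time to [0, 1]
        bin_centres = torch.linspace(0, 1, self.num_bins, device=events.device)
        # Evaluate the kernel on every event-to-bin time offset.
        delta = (t[:, None] - bin_centres[None, :]).reshape(-1, 1)    # (N * B, 1)
        weights = self.kernel(delta).reshape(-1, self.num_bins)       # (N, B)
        grid = torch.zeros(self.num_bins, self.height, self.width, device=events.device)
        flat_idx = y * self.width + x                                 # pixel index per event
        # Scatter-add polarity-weighted kernel responses into each temporal bin.
        for b in range(self.num_bins):
            grid[b].view(-1).index_add_(0, flat_idx, p * weights[:, b])
        return grid  # dense tensor, can be fed to any standard CNN

Usage would be, for example, grid = LearnableEventGrid(180, 240)(events) for a 180x240 sensor (sizes are placeholders); because every operation above is differentiable, the kernel MLP is updated by the same optimizer as the task network.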
Pages: 5632-5642
Page count: 11