End-to-End Learning of Representations for Asynchronous Event-Based Data

Cited by: 211
Authors
Gehrig, Daniel [1,2,3]
Loquercio, Antonio [1,2,3]
Derpanis, Konstantinos G. [4,5]
Scaramuzza, Davide [1,2,3]
Affiliations
[1] Univ Zurich, Dept Informat, Robot & Percept Grp, Zurich, Switzerland
[2] Univ Zurich, Dept Neuroinformat, Robot & Percept Grp, Zurich, Switzerland
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] Ryerson Univ, Toronto, ON, Canada
[5] Samsung AI Ctr Toronto, Toronto, ON, Canada
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Funding
Swiss National Science Foundation; Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
VISION; FEATURES;
DOI
10.1109/ICCV.2019.00573
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Event cameras are vision sensors that record asynchronous streams of per-pixel brightness changes, referred to as "events". They offer appealing advantages over frame-based cameras for computer vision, including high temporal resolution, high dynamic range, and no motion blur. Due to the sparse, non-uniform spatiotemporal layout of the event signal, pattern-recognition algorithms typically aggregate events into a grid-based representation and subsequently process it with a standard vision pipeline, e.g., a Convolutional Neural Network (CNN). In this work, we introduce a general framework to convert event streams into grid-based representations through a sequence of differentiable operations. Our framework comes with two main advantages: (i) it allows the input event representation to be learned together with the task-dedicated network in an end-to-end manner, and (ii) it lays out a taxonomy that unifies the majority of extant event representations in the literature and identifies novel ones. Empirically, we show that learning the event representation end-to-end yields an improvement of approximately 12% on optical flow estimation and object recognition over state-of-the-art methods.
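The idea summarized in the abstract can be illustrated with a minimal sketch: events (x, y, t, p) are accumulated into a spatio-temporal grid, but the temporal weighting is produced by a small learnable MLP instead of a fixed interpolation kernel, so gradients from the downstream task loss flow back into the representation. This is not the authors' released implementation; the class name LearnableEventGrid, the kernel architecture, and the timestamp normalization below are illustrative assumptions.

import torch
import torch.nn as nn


class LearnableEventGrid(nn.Module):
    """Sketch of a differentiable event-to-grid conversion with a learnable temporal kernel."""

    def __init__(self, num_bins: int, height: int, width: int):
        super().__init__()
        self.num_bins, self.height, self.width = num_bins, height, width
        # Small MLP mapping a temporal offset to a kernel weight; it stands in for
        # a fixed (e.g., trilinear) interpolation kernel and is trained jointly
        # with the downstream task network.
        self.kernel = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, events: torch.Tensor) -> torch.Tensor:
        # events: (N, 4) float tensor with rows (x, y, t, p), polarity p in {-1, +1}
        x = events[:, 0].long()
        y = events[:, 1].long()
        t = events[:, 2]
        p = events[:, 3]
        # Normalize timestamps into [0, num_bins - 1].
        t = (t - t.min()) / (t.max() - t.min() + 1e-9) * (self.num_bins - 1)

        flat_idx = y * self.width + x              # pixel index into a flattened H*W plane
        planes = []
        for b in range(self.num_bins):
            dt = (t - b).unsqueeze(-1)             # temporal distance to bin b, shape (N, 1)
            w = self.kernel(dt).squeeze(-1) * p    # learned, polarity-weighted contribution
            plane = torch.zeros(self.height * self.width, dtype=w.dtype, device=w.device)
            # index_add_ is differentiable w.r.t. `w`, so task gradients reach the kernel MLP.
            plane.index_add_(0, flat_idx, w)
            planes.append(plane.view(self.height, self.width))
        return torch.stack(planes, dim=0)          # (num_bins, H, W), ready for a standard CNN

In training, a module like this would sit in front of a task network (e.g., for optical flow estimation or object recognition), and the task loss would update the kernel MLP and the CNN jointly, which is the end-to-end property the abstract describes.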
Pages: 5632-5642
Number of Pages: 11