HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition

Cited by: 383
Authors
Lagorce, Xavier [1 ,2 ,3 ]
Orchard, Garrick [4 ]
Galluppi, Francesco [1 ,2 ,3 ]
Shi, Bertram E. [5 ,6 ]
Benosman, Ryad B. [1 ,2 ,3 ]
Affiliations
[1] Inst Natl Sante & Rech Med, Vis & Nat Computat Grp, F-75012 Paris, France
[2] Univ Paris 06, Inst Vis, Sorbonne Univ, F-75012 Paris, France
[3] CNRS, F-75012 Paris, France
[4] Natl Univ Singapore, Singapore Inst Neurotechnol SINAPSE, Singapore 119077, Singapore
[5] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Kowloon, Hong Kong, Peoples R China
[6] Hong Kong Univ Sci & Technol, Div Biomed Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Neuromorphic sensing; event-based vision; feature extraction
DOI
10.1109/TPAMI.2016.2574707
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time-oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired, frameless, asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces, which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can be used robustly at all stages of an event-based hierarchical model. First-layer feature units operate on groups of pixels, while subsequent-layer feature units operate on the output of lower-level feature units. We report results on a previously published 36-class character recognition task and a four-class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven-class moving face recognition task, achieving 79 percent accuracy.
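The time-surface idea described in the abstract can be sketched compactly: keep, per pixel, the timestamp of its most recent event, and at each incoming event read out an exponentially decayed view of that memory over a local neighborhood. The sketch below is illustrative only, not the paper's reference implementation: it ignores event polarity for brevity, and the function name and parameter values (`radius`, the decay constant `tau`) are assumptions chosen for the example.

```python
import numpy as np

def time_surface(last_times, x, y, t, radius=2, tau=0.05):
    """Return the time-surface patch around event (x, y) at time t.

    last_times: 2D array holding the most recent event timestamp per
                pixel (-inf where no event has occurred yet).
    The result is a (2*radius+1, 2*radius+1) patch in [0, 1] whose
    values decay exponentially with the age of each neighbor's last
    event; pixels that never fired map to exp(-inf) = 0.
    """
    patch = last_times[y - radius:y + radius + 1,
                       x - radius:x + radius + 1]
    return np.exp(-(t - patch) / tau)

# Example: a short event stream on a tiny 8x8 sensor
H, W = 8, 8
last = np.full((H, W), -np.inf)          # no events seen yet
events = [(3, 3, 0.00), (4, 3, 0.01), (3, 4, 0.02)]  # (x, y, t)
for x, y, t in events:
    last[y, x] = t                       # remember latest event per pixel

ts = time_surface(last, x=4, y=4, t=0.03, radius=1)
```

Recent activity near the query pixel yields values close to 1, older activity decays toward 0, and silent pixels read exactly 0, which is what lets a layer's feature units match these patches against learned spatio-temporal prototypes.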
Pages: 1346-1359 (14 pages)