Visual-Auditory saliency detection using event-driven visual sensors

Cited by: 0
Authors
Akolkar, Himanshu [1 ]
Valeiras, David Reverter [2 ]
Benosman, Ryad [2 ]
Bartolozzi, Chiara [1 ]
Affiliations
[1] Ist Italiano Tecnol, ICub Facil, I-16163 Genoa, Italy
[2] Univ Paris 06, Vis Inst, F-75012 Paris, France
Source
PROCEEDINGS OF FIRST INTERNATIONAL CONFERENCE ON EVENT-BASED CONTROL, COMMUNICATION AND SIGNAL PROCESSING EBCCSP 2015 | 2015
Keywords
DOI
Not available
Chinese Library Classification
TM [Electrical technology]; TN [Electronic technology, communication technology];
Discipline Codes
0808 ; 0809 ;
Abstract
This paper presents a novel architecture for audiovisual saliency detection using event-based visual sensors and traditional microphones installed on the head of a humanoid robot. In the context of collision detection, salient sensory events must be detected at the same time in the visual and auditory domains. Real collisions in the visual space can be distinguished from fake ones (e.g., due to movements of two objects that occlude each other) because they generate a sound at the moment of impact. This temporal coincidence is extremely difficult to detect with frame-based sensors, which intrinsically add a fixed delay to the sensory acquisition or can miss the collision entirely. The high temporal resolution of event-driven vision sensors, together with a real-time clustering and tracking algorithm, allows potential collisions to be detected with very low latency. Auditory events corresponding to collisions are detected using simple spectral analysis of the auditory signals. Visual events can therefore be temporally integrated with coherently occurring auditory events to detect fast transitions and disentangle real collisions from visual or auditory events that do not correspond to any collision. The proposed audio-visual collision detection is used in the context of human-robot interaction, to detect people clapping in front of the robot and orient its gaze toward the perceived collision.
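The coincidence mechanism the abstract describes, accepting a visual collision candidate only when an auditory transient occurs at nearly the same time, can be illustrated in a few lines of Python. The sketch below is not the authors' implementation: the energy-jump onset detector, the 5 ms coincidence window, and the function names (detect_audio_onsets, coincident_collisions) are all illustrative assumptions, standing in for the paper's spectral analysis and event-based tracker outputs.

import numpy as np

def detect_audio_onsets(signal, sr, frame_len=256, hop=128, threshold=3.0):
    """Flag frame start times whose spectral energy jumps well above the
    running average energy -- a crude stand-in for the paper's spectral
    analysis of collision sounds (frame sizes and threshold are guesses)."""
    onsets, energies = [], []
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len]
        energy = float(np.sum(np.abs(np.fft.rfft(frame)) ** 2))
        if energies and energy > threshold * np.mean(energies):
            onsets.append(start / sr)  # onset time in seconds
        energies.append(energy)
    return onsets

def coincident_collisions(visual_times, audio_times, window=0.005):
    """Keep visual collision candidates (e.g. times when tracked clusters
    meet) that have an auditory onset within `window` seconds: silent
    occlusions are rejected, real impacts yield a temporally aligned sound."""
    audio = np.asarray(audio_times)
    return [t for t in visual_times
            if audio.size and np.min(np.abs(audio - t)) <= window]

# Example: a visual "collision" at t = 1.000 s is confirmed only if the
# microphone registered an energy transient within 5 ms of it.
print(coincident_collisions([1.000, 2.500], [0.998, 4.000]))  # -> [1.0]

The low latency of the event camera is what makes such a narrow window plausible; with frame-based acquisition the fixed frame delay would force a much looser (and less discriminative) coincidence criterion.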
Pages: 6