CrossEI: Boosting Motion-Oriented Object Tracking With an Event Camera

Times Cited: 0
Authors
Chen, Zhiwen [1 ]
Wu, Jinjian [1 ]
Dong, Weisheng [1 ]
Li, Leida [1 ]
Shi, Guangming [1 ]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China
Keywords
Tracking; Cameras; Feature extraction; Semantics; Object tracking; Motion estimation; Modulation; Dynamics; Sensitivity; Benchmark testing; Event camera; event-image fusion; object tracking; VISION;
DOI
10.1109/TIP.2024.3505672
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Owing to their differential sensitivity and high temporal resolution, event cameras can record detailed motion cues, which complement frame-based cameras and enhance object tracking, especially in challenging dynamic scenes. However, how to better match heterogeneous event-image data and exploit the rich complementary cues they carry remains an open issue. In this paper, we align the event and image modalities by proposing a motion-adaptive event sampling method, and we revisit the cross-complementarity of event-image data to design a bidirectionally enhanced fusion framework. Specifically, the sampling strategy adapts to different dynamic scenes and produces aligned event-image pairs. In addition, we design an image-guided motion estimation unit that extracts explicit instance-level motion, refining uncertain event cues to distinguish primary objects from the background. A semantic modulation module is then devised to use the enhanced object motion to modify the image features. Coupling these two modules, the framework learns both the high motion sensitivity of events and the full texture of images to achieve more accurate and robust tracking. The proposed method is easily embedded in existing tracking pipelines and trained end-to-end. We evaluate it on four large benchmarks, i.e., FE108, VisEvent, FE240hz, and CoeSot. Extensive experiments demonstrate that our method achieves state-of-the-art performance, with large improvements attributable to our sampling strategy and fusion concept.
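The abstract's motion-adaptive sampling idea can be illustrated with a minimal sketch: instead of pairing each frame with a fixed-duration event window, a fixed event *count* is taken, so the effective window shrinks in highly dynamic scenes and stretches in quiet ones. This is a hypothetical illustration under assumed names (`sample_events_for_frame`, `target_count`, `max_window`), not the paper's exact formulation.

```python
import numpy as np

def sample_events_for_frame(event_ts, frame_t, target_count=5000, max_window=0.05):
    """Select indices of events to pair with the frame captured at frame_t.

    Count-based selection adapts the temporal span to scene dynamics:
    fast motion (dense events) -> short window; slow motion -> long
    window, capped at max_window seconds. Hypothetical sketch only.
    """
    event_ts = np.asarray(event_ts)
    # events that occurred no later than the frame
    valid = np.nonzero(event_ts <= frame_t)[0]
    # keep at most the target_count most recent events
    recent = valid[-target_count:]
    # enforce the maximum temporal span of the adaptive window
    keep = recent[event_ts[recent] >= frame_t - max_window]
    return keep
```

For example, with 10,000 events uniformly spread over one second and `target_count=100`, the returned slice covers only the last ~10 ms before the frame; with sparse events, the slice would widen up to `max_window`.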
Pages: 73-84
Page count: 12
Cited References
63 records in total
  • [1] Barranco F., 2018, IEEE/RSJ Int. Conf. Intelligent Robots and Systems (IROS), p. 5764, DOI 10.1109/IROS.2018.8593380
  • [2] Bhat G., 2020, Computer Vision - ECCV 2020, 16th European Conference, LNCS 12368, p. 205, DOI 10.1007/978-3-030-58592-1_13
  • [3] Bhat G., Danelljan M., Van Gool L., Timofte R., "Learning Discriminative Model Prediction for Tracking," IEEE/CVF Int. Conf. Computer Vision (ICCV), 2019, pp. 6181-6190
  • [4] Brandli C., Berner R., Yang M., Liu S.-C., Delbruck T., "A 240 x 180 130 dB 3 μs Latency Global Shutter Spatiotemporal Vision Sensor," IEEE Journal of Solid-State Circuits, 2014, 49(10), pp. 2333-2341
  • [5] Cao H., Chen G., Xia J., Zhuang G., Knoll A., "Fusion-Based Feature Attention Gate Component for Vehicle Detection Based on Event Camera," IEEE Sensors Journal, 2021, 21(21), pp. 24540-24548
  • [6] Chen H. S., 2020, AAAI Conf. Artificial Intelligence, vol. 34, p. 10534
  • [7] Chen X., Yan B., Zhu J., Wang D., Yang X., Lu H., "Transformer Tracking," IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), 2021, pp. 8122-8131
  • [8] Danelljan M., Van Gool L., Timofte R., "Probabilistic Regression for Visual Tracking," IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), 2020, pp. 7181-7190
  • [9] Danelljan M., Bhat G., Khan F. S., Felsberg M., "ATOM: Accurate Tracking by Overlap Maximization," IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4655-4664
  • [10] Delbrück T., 2010, IEEE Int. Symp. Circuits and Systems (ISCAS), p. 2426, DOI 10.1109/ISCAS.2010.5537149