Event-Based Monocular Depth Estimation With Recurrent Transformers

Cited by: 5
Authors
Liu, Xu [1 ,2 ]
Li, Jianing [3 ]
Shi, Jinqiao [4 ]
Fan, Xiaopeng [1 ,2 ]
Tian, Yonghong [2 ,3 ]
Zhao, Debin [1 ,2 ]
Affiliations
[1] Harbin Inst Technol, Res Ctr Intelligent Interface & Human Comp Interac, Dept Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518000, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing 100871, Peoples R China
[4] Beijing Univ Posts & Telecommun, Sch Cyberspace Secur, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Estimation; Cameras; Voltage control; Task analysis; Streaming media; Circuits and systems; Event camera; monocular depth estimator; recurrent transformer; cross attention; VISION;
DOI
10.1109/TCSVT.2024.3378742
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Event cameras, offering high temporal resolution and high dynamic range, have brought a new perspective to common challenges in monocular depth estimation (e.g., motion blur and low light). However, existing CNN-based methods insufficiently exploit global spatial information from asynchronous events, while RNN-based methods show limited capacity to effectively exploit temporal cues for event-based monocular depth estimation. To this end, we propose an event-based monocular depth estimator with recurrent transformers, namely EReFormer. Technically, we first design a transformer-based encoder-decoder that utilizes multi-scale features to model global spatial information from events. Then, we propose a Gate Recurrent Vision Transformer (GRViT), which introduces a recursive mechanism into transformers to leverage rich temporal cues from events. Finally, we present a Cross Attention-guided Skip Connection (CASC), which performs cross attention to fuse multi-scale features and improve global spatial modeling capability. Experimental results show that our EReFormer outperforms state-of-the-art methods by a margin on both synthetic and real-world datasets. Our open-source code is available at https://github.com/liuxu0303/EReFormer.
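The recursive mechanism the abstract describes for GRViT carries a hidden state across successive event slices so that temporal cues accumulate over time. As a rough illustration only, this kind of gated recurrent fusion can be sketched as a GRU-style update over token features; the function, weight names, and shapes below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def gated_recurrent_fusion(x, h, Wz, Uz, Wh, Uh):
    """GRU-style gated update (hypothetical sketch of a recurrent
    transformer block carrying temporal state across event slices).

    x : (tokens, dim) features from the current event slice
    h : (tokens, dim) hidden state from previous slices
    """
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(x @ Wz + h @ Uz)        # update gate: how much new info to admit
    h_tilde = np.tanh(x @ Wh + h @ Uh)  # candidate state from current features
    return (1.0 - z) * h + z * h_tilde  # blend old state with candidate

# Toy usage: 4 tokens with 8-dimensional features, zero-initialized state.
rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal((4, d))
h = np.zeros((4, d))
Wz, Uz, Wh, Uh = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
h = gated_recurrent_fusion(x, h, Wz, Uz, Wh, Uh)
print(h.shape)
```

In an actual recurrent transformer, `x` would be the token embeddings produced by a transformer stage and the update would sit inside the block; the sketch only conveys the gating idea.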
Pages: 7417-7429
Page count: 13
Related Papers
50 entries in total
  • [41] SpikingViT: A Multiscale Spiking Vision Transformer Model for Event-Based Object Detection
    Yu, Lixing
    Chen, Hanqi
    Wang, Ziming
    Zhan, Shaojie
    Shao, Jiankun
    Liu, Qingjie
    Xu, Shu
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2025, 17 (01) : 130 - 146
  • [42] Unsupervised Event-based Learning of Optical Flow, Depth, and Egomotion
    Zhu, Alex Zihao
    Yuan, Liangzhe
    Chaney, Kenneth
    Daniilidis, Kostas
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 989 - 997
  • [43] Self-Supervised Deep Monocular Depth Estimation With Ambiguity Boosting
    Bello, Juan Luis Gonzalez
    Kim, Munchurl
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 9131 - 9149
  • [44] AWDepth: Monocular Depth Estimation for Adverse Weather via Masked Encoding
    Wang, Meng
    Qin, Yunchuan
    Li, Ruihui
    Liu, Zhizhong
    Tang, Zhuo
    Li, Kenli
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (09) : 10873 - 10882
  • [45] Velocity and Color Estimation Using Event-Based Clustering
    Lesage, Xavier
    Tran, Rosalie
    Mancini, Stephane
    Fesquet, Laurent
    Popescu, Dan
    Ichim, Loretta
    SENSORS, 2023, 23 (24)
  • [46] CORNet: Context-Based Ordinal Regression Network for Monocular Depth Estimation
    Meng, Xuyang
    Fan, Chunxiao
    Ming, Yue
    Yu, Hui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (07) : 4841 - 4853
  • [47] Event-Based Head Pose Estimation: Benchmark and Method
    Yuan, Jiahui
    Li, Hebei
    Peng, Yansong
    Wang, Jin
    Jiang, Yuheng
    Zhang, Yueyi
    Sun, Xiaoyan
    COMPUTER VISION - ECCV 2024, PT XV, 2025, 15073 : 191 - 208
  • [48] Sparse Pseudo-LiDAR Depth Assisted Monocular Depth Estimation
    Shao, Shuwei
    Pei, Zhongcai
    Chen, Weihai
    Liu, Qiang
    Yue, Haosong
    Li, Zhengguo
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): : 917 - 929
  • [49] DNA-Depth: A Frequency-Based Day-Night Adaptation for Monocular Depth Estimation
    Shen, Mengjiao
    Wang, Zhongyi
    Su, Shuai
    Liu, Chengju
    Chen, Qijun
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [50] Monocular Depth Estimation Based on Multi-Scale Depth Map Fusion
    Yang, Xin
    Chang, Qingling
    Liu, Xinglin
    He, Siyuan
    Cui, Yan
    IEEE ACCESS, 2021, 9 : 67696 - 67705