Event-Based Monocular Depth Estimation With Recurrent Transformers

Cited by: 5
Authors
Liu, Xu [1 ,2 ]
Li, Jianing [3 ]
Shi, Jinqiao [4 ]
Fan, Xiaopeng [1 ,2 ]
Tian, Yonghong [2 ,3 ]
Zhao, Debin [1 ,2 ]
Affiliations
[1] Harbin Inst Technol, Res Ctr Intelligent Interface & Human Comp Interac, Dept Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518000, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing 100871, Peoples R China
[4] Beijing Univ Posts & Telecommun, Sch Cyberspace Secur, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Estimation; Cameras; Voltage control; Task analysis; Streaming media; Circuits and systems; Event camera; monocular depth estimator; recurrent transformer; cross attention; VISION;
DOI
10.1109/TCSVT.2024.3378742
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Event cameras, offering high temporal resolution and high dynamic range, have brought a new perspective to common challenges in monocular depth estimation (e.g., motion blur and low light). However, existing CNN-based methods insufficiently exploit global spatial information from asynchronous events, while RNN-based methods show limited capacity to effectively exploit temporal cues for event-based monocular depth estimation. To this end, we propose an event-based monocular depth estimator with recurrent transformers, namely EReFormer. Technically, we first design a transformer-based encoder-decoder that utilizes multi-scale features to model global spatial information from events. Then, we propose a Gate Recurrent Vision Transformer (GRViT), which introduces a recursive mechanism into transformers to leverage rich temporal cues from events. Finally, we present a Cross Attention-guided Skip Connection (CASC), which performs cross attention to fuse multi-scale features and improve global spatial modeling capability. Experimental results show that our EReFormer outperforms state-of-the-art methods by a margin on both synthetic and real-world datasets. Our open-source code is available at https://github.com/liuxu0303/EReFormer.
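The recursive mechanism the abstract describes for GRViT carries a hidden state across successive event slices so that temporal cues accumulate over time. As a rough illustration only, this kind of gated recurrent fusion can be sketched as a GRU-style update over token features; the function, weight names, and shapes below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def gated_recurrent_fusion(x, h, Wz, Uz, Wh, Uh):
    """GRU-style gated update (hypothetical sketch of a recurrent
    transformer block carrying temporal state across event slices).

    x : (tokens, dim) features from the current event slice
    h : (tokens, dim) hidden state from previous slices
    """
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(x @ Wz + h @ Uz)        # update gate: how much new info to admit
    h_tilde = np.tanh(x @ Wh + h @ Uh)  # candidate state from current features
    return (1.0 - z) * h + z * h_tilde  # blend old state with candidate

# Toy usage: 4 tokens with 8-dimensional features, zero-initialized state.
rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal((4, d))
h = np.zeros((4, d))
Wz, Uz, Wh, Uh = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
h = gated_recurrent_fusion(x, h, Wz, Uz, Wh, Uh)
print(h.shape)
```

In an actual recurrent transformer, `x` would be the token embeddings produced by a transformer stage and the update would sit inside the block; the sketch only conveys the gating idea.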
Pages: 7417-7429
Page count: 13
Related Papers
50 entries in total
  • [41] SpikingViT: A Multiscale Spiking Vision Transformer Model for Event-Based Object Detection
    Yu, Lixing
    Chen, Hanqi
    Wang, Ziming
    Zhan, Shaojie
    Shao, Jiankun
    Liu, Qingjie
    Xu, Shu
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2025, 17 (01) : 130 - 146
  • [42] Unsupervised Event-based Learning of Optical Flow, Depth, and Egomotion
    Zhu, Alex Zihao
    Yuan, Liangzhe
    Chaney, Kenneth
    Daniilidis, Kostas
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 989 - 997
  • [43] Self-Supervised Deep Monocular Depth Estimation With Ambiguity Boosting
    Bello, Juan Luis Gonzalez
    Kim, Munchurl
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 9131 - 9149
  • [44] AWDepth: Monocular Depth Estimation for Adverse Weather via Masked Encoding
    Wang, Meng
    Qin, Yunchuan
    Li, Ruihui
    Liu, Zhizhong
    Tang, Zhuo
    Li, Kenli
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (09) : 10873 - 10882
  • [45] Velocity and Color Estimation Using Event-Based Clustering
    Lesage, Xavier
    Tran, Rosalie
    Mancini, Stephane
    Fesquet, Laurent
    Popescu, Dan
    Ichim, Loretta
    SENSORS, 2023, 23 (24)
  • [46] CORNet: Context-Based Ordinal Regression Network for Monocular Depth Estimation
    Meng, Xuyang
    Fan, Chunxiao
    Ming, Yue
    Yu, Hui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (07) : 4841 - 4853
  • [47] Event-Based Head Pose Estimation: Benchmark and Method
    Yuan, Jiahui
    Li, Hebei
    Peng, Yansong
    Wang, Jin
    Jiang, Yuheng
    Zhang, Yueyi
    Sun, Xiaoyan
    COMPUTER VISION - ECCV 2024, PT XV, 2025, 15073 : 191 - 208
  • [48] Sparse Pseudo-LiDAR Depth Assisted Monocular Depth Estimation
    Shao, Shuwei
    Pei, Zhongcai
    Chen, Weihai
    Liu, Qiang
    Yue, Haosong
    Li, Zhengguo
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): : 917 - 929
  • [49] DNA-Depth: A Frequency-Based Day-Night Adaptation for Monocular Depth Estimation
    Shen, Mengjiao
    Wang, Zhongyi
    Su, Shuai
    Liu, Chengju
    Chen, Qijun
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [50] Monocular Depth Estimation Based on Multi-Scale Depth Map Fusion
    Yang, Xin
    Chang, Qingling
    Liu, Xinglin
    He, Siyuan
    Cui, Yan
    IEEE ACCESS, 2021, 9 : 67696 - 67705