NeRF-VO: Real-Time Sparse Visual Odometry With Neural Radiance Fields

Cited by: 5
Authors
Naumann, Jens [1]
Xu, Binbin [2]
Leutenegger, Stefan [1]
Zuo, Xingxing [1,3]
Affiliations
[1] Tech Univ Munich, D-80333 Munich, Germany
[2] Univ Toronto, Toronto, ON M3H 5T6, Canada
[3] CALTECH, Pasadena, CA 91125 USA
Keywords
NeRF; visual odometry; SLAM; dense mapping
DOI
10.1109/LRA.2024.3421192
Chinese Library Classification (CLC) number
TP24 [Robotics]
Subject classification codes
080202; 1405
Abstract
We introduce a novel monocular visual odometry (VO) system, NeRF-VO, that integrates learning-based sparse visual odometry for low-latency camera tracking and a neural radiance scene representation for fine-detailed dense reconstruction and novel view synthesis. Our system initializes camera poses using sparse visual odometry and obtains view-dependent dense geometry priors from a monocular prediction network. We harmonize the scale of poses and dense geometry, treating them as supervisory cues to train a neural implicit scene representation. NeRF-VO demonstrates exceptional performance in both photometric and geometric fidelity of the scene representation by jointly optimizing a sliding window of keyframed poses and the underlying dense geometry, which is accomplished through training the radiance field with volume rendering. We surpass SOTA methods in pose estimation accuracy, novel view synthesis fidelity, and dense reconstruction quality across a variety of synthetic and real-world datasets while achieving a higher camera tracking frequency and consuming less GPU memory.
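The abstract says the scale of the poses and the dense geometry is harmonized, but it does not say how. As an illustration only, one common way to bring a monocular depth prediction into the scale of a sparse odometry estimate is a least-squares fit of a scale and shift over the pixels where sparse depth is available. The sketch below assumes that approach; the function name `align_scale_shift`, the array shapes, and the synthetic data are hypothetical and are not taken from the NeRF-VO implementation.

```python
import numpy as np


def align_scale_shift(dense_depth, sparse_depth, mask):
    """Fit scale s and shift t minimizing ||s * dense + t - sparse||^2.

    Only pixels where `mask` is True (valid sparse depth) enter the fit.
    This is an assumed alignment scheme, not the paper's confirmed method.
    """
    d = dense_depth[mask].astype(np.float64)
    z = sparse_depth[mask].astype(np.float64)
    # Linear system [d, 1] @ [s, t]^T = z, solved in the least-squares sense.
    A = np.stack([d, np.ones_like(d)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, z, rcond=None)
    return s, t


# Synthetic example: a dense depth map and sparse depths that differ from it
# by a known scale and shift at roughly 5% of the pixels.
rng = np.random.default_rng(0)
dense = rng.uniform(0.5, 5.0, size=(48, 64))
true_s, true_t = 2.5, 0.3
mask = rng.random(dense.shape) < 0.05
sparse = np.zeros_like(dense)
sparse[mask] = true_s * dense[mask] + true_t

s, t = align_scale_shift(dense, sparse, mask)
aligned = s * dense + t  # dense depth expressed in the sparse (odometry) scale
```

In a pipeline of the kind the abstract describes, the aligned depth map could then act as a geometric supervision signal alongside the poses. On real data a robust estimator would likely be preferable to plain least squares, since sparse depths contain outliers.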
Pages: 7278-7285
Number of pages: 8