SCENES: Subpixel Correspondence Estimation With Epipolar Supervision

Cited by: 0
Authors
Kloepfer, Dominik A. [1 ]
Henriques, Joao F. [1 ]
Campbell, Dylan [2 ]
Affiliations
[1] Univ Oxford, Visual Geometry Grp, Oxford, England
[2] Australian Natl Univ, Sch Comp, Canberra, ACT 0200, Australia
Source
2024 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV 2024 | 2024
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
DOI
10.1109/3DV62453.2024.00137
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extracting point correspondences from two or more views of a scene is a fundamental computer vision problem, with particular importance for relative camera pose estimation and structure-from-motion. Existing local feature matching approaches, trained with correspondence supervision on large-scale datasets, obtain highly accurate matches on the test sets. However, unlike classic feature extractors, they do not generalise well to new datasets with characteristics different from those they were trained on. Instead, they require fine-tuning, which assumes that ground-truth correspondences, or ground-truth camera poses and 3D structure, are available. We relax this assumption by removing the requirement of 3D structure, e.g., depth maps or point clouds, and only require camera pose information, which can be obtained from odometry. We do so by replacing correspondence losses with epipolar losses, which encourage putative matches to lie on the associated epipolar line. While weaker than correspondence supervision, we observe that this cue is sufficient for fine-tuning existing models on new data. We then further relax the assumption of known camera poses by using pose estimates in a novel bootstrapping approach. We evaluate on highly challenging datasets, including an indoor drone dataset and an outdoor smartphone camera dataset, and obtain state-of-the-art results without strong supervision.
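As a rough illustration of the epipolar constraint underlying the losses described in the abstract (a minimal sketch, not the authors' implementation), the distance from a putative match to its associated epipolar line can be computed from the fundamental matrix F; an epipolar loss would penalise this distance for all putative matches:

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance (in pixels) from point x2 in image 2 to the epipolar
    line induced by point x1 in image 1.

    F  : 3x3 fundamental matrix mapping image-1 points to image-2 lines.
    x1 : (u, v) pixel coordinates in image 1.
    x2 : (u, v) pixel coordinates in image 2.
    """
    x1h = np.array([x1[0], x1[1], 1.0])      # homogeneous coordinates
    x2h = np.array([x2[0], x2[1], 1.0])
    l2 = F @ x1h                             # epipolar line a*u + b*v + c = 0
    return abs(x2h @ l2) / np.hypot(l2[0], l2[1])

# Toy example (assumed setup): pure translation along x with identity
# intrinsics gives F = [t]_x, so true matches share the same v coordinate.
F = np.array([[0.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])
print(epipolar_distance(F, (3.0, 2.0), (5.0, 2.0)))  # on the line -> 0.0
print(epipolar_distance(F, (3.0, 2.0), (5.0, 4.0)))  # off the line -> 2.0
```

Note that this supervision only constrains matches to a one-dimensional line per point, which is why the abstract describes it as weaker than full correspondence supervision.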
Pages: 21 - 30
Number of pages: 10