SCENES: Subpixel Correspondence Estimation With Epipolar Supervision

Cited by: 0
Authors
Kloepfer, Dominik A. [1 ]
Henriques, Joao F. [1 ]
Campbell, Dylan [2 ]
Affiliations
[1] Univ Oxford, Visual Geometry Grp, Oxford, England
[2] Australian Natl Univ, Sch Comp, Canberra, ACT 0200, Australia
Source
2024 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV 2024 | 2024
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
DOI
10.1109/3DV62453.2024.00137
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extracting point correspondences from two or more views of a scene is a fundamental computer vision problem, with particular importance for relative camera pose estimation and structure-from-motion. Existing local feature matching approaches, trained with correspondence supervision on large-scale datasets, obtain highly accurate matches on the test sets. However, unlike classic feature extractors, they do not generalise well to new datasets with characteristics different from those they were trained on. Instead, they require fine-tuning, which assumes that ground-truth correspondences, or ground-truth camera poses and 3D structure, are available. We relax this assumption by removing the requirement of 3D structure, e.g., depth maps or point clouds, and only require camera pose information, which can be obtained from odometry. We do so by replacing correspondence losses with epipolar losses, which encourage putative matches to lie on the associated epipolar line. While weaker than correspondence supervision, we observe that this cue is sufficient for fine-tuning existing models on new data. We then further relax the assumption of known camera poses by using pose estimates in a novel bootstrapping approach. We evaluate on highly challenging datasets, including an indoor drone dataset and an outdoor smartphone camera dataset, and obtain state-of-the-art results without strong supervision.
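As a rough illustration of the epipolar constraint underlying the losses described in the abstract (a minimal sketch, not the authors' implementation), the distance from a putative match to its associated epipolar line can be computed from the fundamental matrix F; an epipolar loss would penalise this distance for all putative matches:

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance (in pixels) from point x2 in image 2 to the epipolar
    line induced by point x1 in image 1.

    F  : 3x3 fundamental matrix mapping image-1 points to image-2 lines.
    x1 : (u, v) pixel coordinates in image 1.
    x2 : (u, v) pixel coordinates in image 2.
    """
    x1h = np.array([x1[0], x1[1], 1.0])      # homogeneous coordinates
    x2h = np.array([x2[0], x2[1], 1.0])
    l2 = F @ x1h                             # epipolar line a*u + b*v + c = 0
    return abs(x2h @ l2) / np.hypot(l2[0], l2[1])

# Toy example (assumed setup): pure translation along x with identity
# intrinsics gives F = [t]_x, so true matches share the same v coordinate.
F = np.array([[0.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])
print(epipolar_distance(F, (3.0, 2.0), (5.0, 2.0)))  # on the line -> 0.0
print(epipolar_distance(F, (3.0, 2.0), (5.0, 4.0)))  # off the line -> 2.0
```

Note that this supervision only constrains matches to a one-dimensional line per point, which is why the abstract describes it as weaker than full correspondence supervision.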
Pages: 21 - 30
Number of pages: 10