RoMa: Robust Dense Feature Matching

Cited by: 25
Authors
Edstedt, Johan [1]
Sun, Qiyu [2]
Bokman, Georg [3]
Wadenback, Marten [1]
Felsberg, Michael [1]
Affiliations
[1] Linkoping Univ, Linkoping, Sweden
[2] East China Univ Sci & Technol, Shanghai, Peoples R China
[3] Chalmers Univ Technol, Gothenburg, Sweden
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024
Funding
Swedish Research Council
Keywords
DOI
10.1109/CVPR52733.2024.01871
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature matching is an important computer vision task that involves estimating correspondences between two images of a 3D scene, and dense methods estimate all such correspondences. The aim is to learn a robust model, i.e., a model able to match under challenging real-world changes. In this work, we propose such a model, leveraging frozen pretrained features from the foundation model DINOv2. Although these features are significantly more robust than local features trained from scratch, they are inherently coarse. We therefore combine them with specialized ConvNet fine features, creating a precisely localizable feature pyramid. To further improve robustness, we propose a tailored transformer match decoder that predicts anchor probabilities, which enables it to express multimodality. Finally, we propose an improved loss formulation through regression-by-classification with subsequent robust regression. We conduct a comprehensive set of experiments that show that our method, RoMa, achieves significant gains, setting a new state-of-the-art. In particular, we achieve a 36% improvement on the extremely challenging WxBS benchmark. Code is provided at github.com/Parskatt/RoMa.
Pages: 19790-19800
Page count: 11