Visual Localization using Imperfect 3D Models from the Internet

Cited by: 11

Authors:

Panek, Vojtech ^{[1,2]}

Kukelova, Zuzana ^{[3]}

Sattler, Torsten ^{[2]}

Affiliations:

[1] Czech Tech Univ, Fac Elect Engn, Prague, Czech Republic

[2] Czech Tech Univ, Czech Inst Informat Robot & Cybernet, Prague, Czech Republic

[3] Czech Tech Univ, Visual Recognit Grp, Fac Elect Engn, Prague, Czech Republic

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023

Keywords:

POSE ESTIMATION; IMAGE; ALIGNMENT; OBJECTS; WORLD;

DOI:

10.1109/CVPR52729.2023.01266

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual localization is a core component in many applications, including augmented reality (AR). Localization algorithms compute the camera pose of a query image w.r.t. a scene representation, which is typically built from images. This often requires capturing and storing large amounts of data, followed by running Structure-from-Motion (SfM) algorithms. An interesting and underexplored source of data for building scene representations is 3D models that are readily available on the Internet, e.g., hand-drawn CAD models, or 3D models generated from building footprints or from aerial images. These models make it possible to perform visual localization right away, without the time-consuming scene capturing and model building steps. Yet, this also comes with challenges, as the available 3D models are often imperfect reflections of reality. For example, the models might have only generic textures or no textures at all, might provide only a simple approximation of the scene geometry, or might be stretched. This paper studies how the imperfections of these models affect localization accuracy. We create a new benchmark for this task and provide a detailed experimental evaluation based on multiple 3D models per scene. We find that 3D models from the Internet show promise as an easy-to-obtain scene representation. At the same time, there is significant room for improvement for visual localization pipelines. To foster research on this interesting and challenging task, we release our benchmark at v-pnk.github.io/cadloc.


Pages: 13175-13186

Page count: 12

