Visual Localization using Imperfect 3D Models from the Internet

Cited by: 11
Authors:
Panek, Vojtech [1 ,2 ]
Kukelova, Zuzana [3 ]
Sattler, Torsten [2 ]
Affiliations:
[1] Czech Tech Univ, Fac Elect Engn, Prague, Czech Republic
[2] Czech Tech Univ, Czech Inst Informat Robot & Cybernet, Prague, Czech Republic
[3] Czech Tech Univ, Visual Recognit Grp, Fac Elect Engn, Prague, Czech Republic
Source:
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Keywords:
POSE ESTIMATION; IMAGE; ALIGNMENT; OBJECTS; WORLD
DOI
10.1109/CVPR52729.2023.01266
Chinese Library Classification:
TP18 [Artificial Intelligence Theory]
Subject Classification Codes:
081104; 0812; 0835; 1405
Abstract:
Visual localization is a core component in many applications, including augmented reality (AR). Localization algorithms compute the camera pose of a query image w.r.t. a scene representation, which is typically built from images. This often requires capturing and storing large amounts of data, followed by running Structure-from-Motion (SfM) algorithms. An interesting, and underexplored, source of data for building scene representations are 3D models that are readily available on the Internet, e.g., hand-drawn CAD models, 3D models generated from building footprints, or from aerial images. These models allow performing visual localization right away, without the time-consuming scene capturing and model building steps. Yet, they also come with challenges, as the available 3D models are often imperfect reflections of reality. E.g., the models might have only generic textures or no textures at all, might provide only a simple approximation of the scene geometry, or might be stretched. This paper studies how the imperfections of these models affect localization accuracy. We create a new benchmark for this task and provide a detailed experimental evaluation based on multiple 3D models per scene. We show that 3D models from the Internet are promising as an easy-to-obtain scene representation. At the same time, there is significant room for improvement for visual localization pipelines. To foster research on this interesting and challenging task, we release our benchmark at v-pnk.github.io/cadloc.
Pages: 13175 - 13186 (12 pages)