Learning context-aware local feature descriptors for 3D reconstruction

Cited: 0
Authors
Yang, Jian [1 ]
Zhou, Jian [2 ]
Fan, Hao [1 ]
Dong, Junyu [1 ]
Yu, Hui [3 ]
Affiliations
[1] Ocean Univ China, Dept Informat Sci & Technol, Qingdao 266100, Peoples R China
[2] Qingdao Univ, Dept Business Management, Qingdao 266100, Peoples R China
[3] Univ Portsmouth, Sch Creat Technol, Portsmouth PO1 2UP, England
Funding
National Natural Science Foundation of China;
Keywords
Feature descriptor; Feature matching; Visual localization; 3D reconstruction;
DOI
10.1016/j.neucom.2024.127793
CLC Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
The generation of generalizable and discriminative descriptors plays a crucial role in image matching and 3D reconstruction. While numerous existing solutions concentrate on encoding specific invariances, such as illumination or viewpoint invariance, they often struggle to achieve robustness and generalization: constrained by limited information capacity, they frequently fail to adapt to diverse and demanding environments. In this paper, we introduce a novel approach that maximizes the utilization of hidden feature informativeness to address these challenges. Specifically, we propose the Hierarchical Context-aware Aggregation Network (HCNet), which employs a hierarchical dense feature constraint in a coarse-to-refinement description manner. In this approach, a coarse-level descriptor represents the overall information of the image, while the refinement descriptor captures its detailed information. Leveraging the strengths of both CNN and Transformer architectures, our hierarchical dense feature constraint encodes both local features and long-range information to efficiently generate dense feature descriptions. To boost descriptor informativeness and enhance matching accuracy, we introduce the Context-aware Attention Aggregation (CAA) model, which adaptively aggregates features from various scales in an efficient coarse-to-refinement manner. Additionally, we design a hierarchical triplet training strategy that considers both variant and invariant properties of hierarchical features, aiming to enhance descriptor informativeness while preserving strong discriminative qualities. Our experiments, conducted on two popular feature-matching benchmarks as well as a challenging long-term visual localization benchmark, demonstrate that our method significantly improves matching accuracy and outperforms state-of-the-art descriptors. Moreover, our approach exhibits superior generalization capabilities in various 3D reconstruction scenarios.
Pages: 12
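
For readers who want a concrete picture of the two mechanisms the abstract names, the following is a minimal PyTorch sketch: a two-branch coarse-to-refinement descriptor network whose per-pixel gated fusion stands in for the Context-aware Attention Aggregation (CAA) module, plus a two-level triplet loss in the spirit of the hierarchical triplet training strategy. All module names, layer choices, shapes, and margin values here are illustrative assumptions drawn from the abstract's description, not the authors' HCNet implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CoarseToFineDescriptor(nn.Module):
    """Two-branch dense descriptor (illustrative, not the paper's code):
    a strided 'coarse' branch for wide context and a full-resolution
    'refinement' branch for local detail, fused by a per-pixel learned
    gate that stands in for the CAA module."""
    def __init__(self, dim=128):
        super().__init__()
        self.coarse = nn.Sequential(           # low-res, wide-context map
            nn.Conv2d(3, dim, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1))
        self.fine = nn.Sequential(             # full-res, detail-preserving map
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1))
        self.gate = nn.Conv2d(2 * dim, 1, 1)   # how much context per pixel

    def forward(self, img):
        c = self.coarse(img)
        f = self.fine(img)
        # Upsample the coarse context to the refined resolution before fusing.
        c_up = F.interpolate(c, size=f.shape[-2:], mode="bilinear",
                             align_corners=False)
        w = torch.sigmoid(self.gate(torch.cat([f, c_up], dim=1)))
        refined = F.normalize(f + w * c_up, dim=1)   # context-aware fusion
        return F.normalize(c, dim=1), refined        # coarse map, refined map

def hierarchical_triplet_loss(anchor, positive, negative,
                              m_coarse=0.5, m_fine=1.0):
    """Triplet loss applied at both hierarchy levels: a looser margin for
    coarse descriptors (tolerant of appearance variation) and a tighter one
    for refined descriptors (strongly discriminative). The two-margin scheme
    is an assumption. Each argument is a (coarse, refined) pair of (N, dim)
    descriptor batches sampled at matched keypoints."""
    def triplet(a, p, n, m):
        d_pos = (a - p).pow(2).sum(-1)
        d_neg = (a - n).pow(2).sum(-1)
        return F.relu(d_pos - d_neg + m).mean()
    return (triplet(anchor[0], positive[0], negative[0], m_coarse)
            + triplet(anchor[1], positive[1], negative[1], m_fine))

A quick shape check: net = CoarseToFineDescriptor(); c, r = net(torch.randn(1, 3, 64, 64)) yields a coarse map c of shape (1, 128, 16, 16) and a refined map r of shape (1, 128, 64, 64); sampling matched keypoint descriptors from both maps produces the (coarse, refined) pairs the loss expects.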