Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud

Cited by: 24
Authors
Feng, Mingtao [1 ]
Li, Zhen [1 ]
Li, Qi [1 ]
Zhang, Liang [1 ]
Zhang, XiangDong [1 ]
Zhu, Guangming [1 ]
Zhang, Hui [2 ]
Wang, Yaonan [2 ]
Mian, Ajmal [3 ]
Affiliations
[1] Xidian Univ, Xian, Peoples R China
[2] Hunan Univ, Changsha, Peoples R China
[3] Univ Western Australia, Perth, WA, Australia
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
DOI
10.1109/ICCV48922.2021.00370
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
3D object grounding aims to locate the most relevant target object in a raw point cloud scene based on a free-form language description. Understanding complex and diverse descriptions, and lifting them directly to a point cloud, is a new and challenging topic due to the irregular and sparse nature of point clouds. There are three main challenges in 3D object grounding: finding the main focus in the complex and diverse description; understanding the point cloud scene; and locating the target object. In this paper, we address all three challenges. Firstly, we propose a language scene graph module to capture the rich structure and long-distance phrase correlations. Secondly, we introduce a multi-level 3D proposal relation graph module to extract the object-object and object-scene co-occurrence relationships and strengthen the visual features of the initial proposals. Lastly, we develop a description guided 3D visual graph module that encodes global contexts of phrases and proposals via a node matching strategy. Extensive experiments on challenging benchmark datasets (ScanRefer [3] and Nr3D [42]) show that our algorithm outperforms the existing state-of-the-art. Our code is available at https://github.com/PNXD/FFL-3DOG.
Pages: 3702-3711
Page count: 10
Related Papers
50 total
  • [1] Graph Convolutional Network for 3D Object Pose Estimation in a Point Cloud
    Jung, Tae-Won
    Jeong, Chi-Seo
    Kim, In-Seon
    Yu, Min-Su
    Kwon, Soon-Chul
    Jung, Kye-Dong
    SENSORS, 2022, 22 (21)
  • [2] 3D free-form surface registration and object recognition
    Chua, CS
    Jarvis, R
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 1996, 17 (01) : 77 - 99
  • [3] Learning Free-Form Deformations for 3D Object Reconstruction
    Jack, Dominic
    Pontes, Jhony K.
    Sridharan, Sridha
    Fookes, Clinton
    Shirazi, Sareh
    Maire, Frederic
    Eriksson, Anders
    COMPUTER VISION - ACCV 2018, PT II, 2019, 11362 : 317 - 333
  • [4] Learning Free-Form Deformations for 3D Object Reconstruction
    Jack, Dominic
    Pontes, Jhony K.
    Sridharan, Sridha
    Fookes, Clinton
    Shirazi, Sareh
    Maire, Frederic
    Eriksson, Anders
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019, 11362 LNCS : 317 - 333
  • [5] Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud
    Shi, Weijing
    Rajkumar, Ragunathan
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 1708 - 1716
  • [6] Free-form 3D object reconstruction from range images
    Schutz, C
    Jost, T
    Hugli, H
    INTERNATIONAL CONFERENCE ON VIRTUAL SYSTEMS AND MULTIMEDIA - VSMM'97, PROCEEDINGS, 1997, : 69 - 70
  • [7] Research on laser measurement point cloud preprocessing and 3D reconstruction technology for free-form surfaces
    Sun, Bin
    Song, Junfang
    Cao, Yi
    Zhao, Xiaoqian
    REVIEW OF SCIENTIFIC INSTRUMENTS, 2024, 95 (11)
  • [8] Lidar Point Cloud Guided Monocular 3D Object Detection
    Peng, Liang
    Liu, Fei
    Yu, Zhengxu
    Yan, Senbo
    Deng, Dan
    Yang, Zheng
    Liu, Haifeng
    Cai, Deng
    COMPUTER VISION - ECCV 2022, PT I, 2022, 13661 : 123 - 139
  • [9] Multi-scale free-form 3D object recognition using 3D models
    Mokhtarian, F
    Khalili, N
    Yuen, P
    IMAGE AND VISION COMPUTING, 2001, 19 (05) : 271 - 281
  • [10] New methods for projecting a 3D object onto a free-form surface
    Pei, Jingyu
    Wang, Xiaoping
    Zhang, Leen
    Zhou, Yu
    Qian, Jinyuan
    ENGINEERING COMPUTATIONS, 2021, 38 (02) : 852 - 873