PointGL: A Simple Global-Local Framework for Efficient Point Cloud Analysis

Cited by: 5
Authors
Li, Jianan [1,2]
Wang, Jie [1]
Xu, Tingfa [1,2,3]
Affiliations
[1] Beijing Inst Technol, Beijing 100081, Peoples R China
[2] Minist Educ China, Key Lab Photoelect Imaging Technol & Syst, Beijing 100081, Peoples R China
[3] Beijing Inst Technol, Chongqing Innovat Ctr, Chongqing 401135, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Point cloud; feature embedding; graph
DOI
10.1109/TMM.2024.3358695
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Efficient analysis of point clouds is critical for real-world 3D applications. Prevailing point-based models follow the PointNet++ methodology: point features are embedded and abstracted within a sequence of spatially overlapping local point sets, which introduces noticeable computational redundancy. Inspired by the streamlined paradigm of pixel embedding followed by regional pooling in Convolutional Neural Networks (CNNs), we introduce PointGL, a simple yet effective architecture for efficient point cloud analysis. PointGL extracts features hierarchically through two recursive steps. First, Global Point Embedding uses plain residual Multilayer Perceptrons (MLPs) to embed features for each individual point. Second, a novel Local Graph Pooling step characterizes point-to-point relationships and abstracts regional representations through compact local graphs. Combining one-time point embedding with parameter-free graph pooling gives PointGL its defining attributes: low model complexity and high efficiency. PointGL attains state-of-the-art accuracy on the ScanObjectNN dataset while running more than 5 times faster and using only about 4% of the FLOPs and 30% of the parameters of the recent PointMLP model.
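The abstract describes a two-stage design: a per-point residual-MLP embedding (the global step) followed by parameter-free pooling over small local graphs (the local step). The PyTorch sketch below illustrates that idea only; the layer widths, the neighborhood size k, the edge definition (neighbor-minus-center feature differences), and the max-pooling rule are illustrative assumptions, not the authors' published implementation.

# A minimal sketch of the two-stage idea in the abstract, assuming a
# residual-MLP per-point embedding and parameter-free k-NN graph pooling.
import torch
import torch.nn as nn


class ResidualPointEmbedding(nn.Module):
    """Embed every point independently with a residual MLP (global step)."""

    def __init__(self, in_dim: int = 3, dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(in_dim, dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(inplace=True), nn.Linear(dim, dim)
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (B, N, 3) -> features: (B, N, dim)
        feat = self.proj(xyz)
        return feat + self.mlp(feat)  # residual connection


def local_graph_pooling(xyz: torch.Tensor, feat: torch.Tensor, k: int = 16) -> torch.Tensor:
    """Parameter-free pooling over a k-NN graph (local step, assumed form).

    For each point, gather its k nearest neighbors, describe the local graph
    by neighbor-minus-center feature differences, and max-pool them.
    """
    # Pairwise distances and k nearest neighbors: indices of shape (B, N, k)
    dist = torch.cdist(xyz, xyz)
    idx = dist.topk(k, largest=False).indices

    B, N, C = feat.shape
    batch = torch.arange(B, device=feat.device).view(B, 1, 1)
    neighbor_feat = feat[batch, idx]               # (B, N, k, C)
    edge_feat = neighbor_feat - feat.unsqueeze(2)  # point-to-point relations
    pooled = edge_feat.max(dim=2).values           # (B, N, C), no learned weights
    return feat + pooled                           # fuse center and local context


if __name__ == "__main__":
    points = torch.randn(2, 1024, 3)               # toy batch of point clouds
    feat = ResidualPointEmbedding()(points)
    out = local_graph_pooling(points, feat)
    print(out.shape)                               # torch.Size([2, 1024, 64])

In this reading, all learned parameters live in the point-wise embedding, while the graph pooling only gathers and compares existing features, which is consistent with the abstract's claim of one-time point embedding plus parameter-free graph pooling.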
Pages: 6931-6942
Number of pages: 12