Masked Structural Point Cloud Modeling to Learn 3D Representation

Cited by: 0
Authors
Yamada, Ryosuke [1 ,2 ]
Tadokoro, Ryu [2 ]
Qiu, Yue [2 ]
Kataoka, Hirokatsu [2 ]
Satoh, Yutaka [1 ,2 ]
Affiliations
[1] Univ Tsukuba, Grad Sch Sci & Technol, Tsukuba 3058577, Japan
[2] Natl Inst Adv Ind Sci & Technol, Tsukuba 3058560, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Deep learning; computer vision; 3D object recognition; point cloud; transfer learning; self-supervised learning; formula-driven supervised learning; NETWORK;
DOI
10.1109/ACCESS.2024.3470971
CLC Number
TP [Automation technology, computer technology];
Discipline Code
0812;
Abstract
Pre-training for 3D object recognition typically requires a large-scale 3D dataset to learn effective 3D geometric representations. However, constructing such datasets is costly due to the extensive 3D data collection and human annotation required. This paper explores a synthetic pre-training approach that learns 3D geometric representations by reconstructing structural point clouds, without relying on real data or human annotation. We propose the Point Cloud Perlin Noise (PCPN) dataset, an automatically generated point cloud dataset that uses Perlin noise to simulate natural 3D structures found in the real world. The proposed method enables the rapid generation of diverse 3D geometric patterns from a simple Perlin noise-based formula, significantly reducing the human effort typically involved in creating conventional 3D datasets. We applied PointMAE to the PCPN dataset for pre-training and observed improved performance on downstream tasks such as 3D shape classification and part segmentation. Our experiments showed that the proposed pre-trained model outperformed a model trained from scratch on ModelNet40 by 1.4%. In addition, our pre-training strategy proves effective for 3D object recognition without requiring real data or supervised labels. This study highlights that Perlin noise can capture 3D structural properties and that the diversity of geometric patterns is crucial for learning effective 3D geometric representations.
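The abstract states that PCPN samples are generated from a simple Perlin noise-based formula, but the record does not include the formula itself. The following is a minimal Python sketch, assuming a standard 3D gradient-noise implementation and a hypothetical iso-surface sampling step; names such as perlin_noise_3d and generate_perlin_point_cloud are illustrative and not from the paper, and the pre-training stage (e.g., PointMAE) is not covered here.

```python
# Minimal sketch: sample a structural point cloud from a 3D Perlin-style
# gradient-noise field. Function names and parameters are illustrative
# placeholders, not the paper's actual PCPN generation code.
import numpy as np


def fade(t):
    # Perlin's quintic interpolant: 6t^5 - 15t^4 + 10t^3.
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)


def perlin_noise_3d(coords, grid_size, seed=0):
    # Evaluate gradient noise at N x 3 query points lying in [0, grid_size).
    rng = np.random.default_rng(seed)
    grads = rng.normal(size=(grid_size + 1, grid_size + 1, grid_size + 1, 3))
    grads /= np.linalg.norm(grads, axis=-1, keepdims=True)

    p0 = np.floor(coords).astype(int)   # lower lattice corner of each cell
    frac = coords - p0                  # position within the cell
    w = fade(frac)                      # smooth interpolation weights

    noise = np.zeros(coords.shape[0])
    for dx in (0, 1):                   # trilinear blend of the 8 cell corners
        for dy in (0, 1):
            for dz in (0, 1):
                ix, iy, iz = p0[:, 0] + dx, p0[:, 1] + dy, p0[:, 2] + dz
                g = grads[ix, iy, iz]
                offset = frac - np.array([dx, dy, dz])
                dot = np.einsum("ij,ij->i", g, offset)
                weight = ((w[:, 0] if dx else 1.0 - w[:, 0])
                          * (w[:, 1] if dy else 1.0 - w[:, 1])
                          * (w[:, 2] if dz else 1.0 - w[:, 2]))
                noise += weight * dot
    return noise


def generate_perlin_point_cloud(num_points=2048, grid_size=4, band=0.02, seed=0):
    # Oversample the cube, then keep points near the noise iso-surface
    # (|noise| < band) so the surviving points trace a surface-like 3D structure.
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(0.0, grid_size, size=(num_points * 100, 3))
    values = perlin_noise_3d(candidates, grid_size, seed)
    points = candidates[np.abs(values) < band][:num_points]

    points -= points.mean(axis=0)                    # center at the origin
    points /= np.linalg.norm(points, axis=1).max()   # scale into the unit sphere
    return points


if __name__ == "__main__":
    cloud = generate_perlin_point_cloud()
    print(cloud.shape)  # up to (2048, 3), depending on how many points survive
```

Keeping only points near the zero iso-surface is one common way to turn a scalar noise field into a surface-like point set; the paper's actual PCPN construction may differ in how it derives structure from the noise.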
Pages: 142291-142305
Page count: 15