Curriculumformer: Taming Curriculum Pre-Training for Enhanced 3-D Point Cloud Understanding

Cited by: 0

Authors:

Fei, Ben [1]

Luo, Tianyue [1]

Yang, Weidong [1]

Liu, Liwen [1]

Zhang, Rui [1]

He, Ying [2]

Affiliations:

[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China

[2] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore 639798, Singapore

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

Point cloud compression; Transformers; Task analysis; Representation learning; Geometry; Data models; Accuracy; 3-D representation learning; curriculum learning; point clouds; self-supervised learning; transformer;

DOI:

10.1109/TNNLS.2024.3406587

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Learning universal representations of 3-D point clouds is essential for reducing the need for manual annotation of large-scale and irregular point cloud datasets. The current modus operandi for representation learning is self-supervised learning, which has shown great potential for improving point cloud understanding. Nevertheless, it remains an open problem how to employ auto-encoding for learning universal 3-D representations of irregularly structured point clouds, as previous methods focus on either global shapes or local geometries. To this end, we present a cascaded self-supervised point cloud representation learning framework, dubbed Curriculumformer, aiming to tame curriculum pre-training for enhanced point cloud understanding. Our main idea lies in devising a progressive pre-training strategy, which trains the Transformer in an easy-to-hard manner. Specifically, we first pre-train the Transformer using an upsampling strategy, which allows it to learn global information. Then, we follow up with a completion strategy, which enables the Transformer to gain insight into local geometries. Finally, we propose a Multi-Modal Multi-Modality Contrastive Learning (M4CL) strategy to enhance the ability of representation learning by enriching the Transformer with semantic information. In this way, the pre-trained Transformer can be easily transferred to a wide range of downstream applications. We demonstrate the superior performance of Curriculumformer on various discriminative and generative tasks, outperforming state-of-the-art methods. Moreover, Curriculumformer can also be integrated into other off-the-shelf methods to promote their performance. Our code is available at https://github.com/Fayeben/Curriculumformer.


Pages: 1 / 15

Page count: 15
