Curriculumformer: Taming Curriculum Pre-Training for Enhanced 3-D Point Cloud Understanding

Cited by: 0

Authors:

Fei, Ben [1]

Luo, Tianyue [1]

Yang, Weidong [1]

Liu, Liwen [1]

Zhang, Rui [1]

He, Ying [2]

Affiliations:

[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China

[2] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore 639798, Singapore

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024

Funding:

National Natural Science Foundation of China

Keywords:

Point cloud compression; Transformers; Task analysis; Representation learning; Geometry; Data models; Accuracy; 3-D representation learning; curriculum learning; point clouds; self-supervised learning; transformer;

DOI:

10.1109/TNNLS.2024.3406587

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Learning universal representations of 3-D point clouds is essential for reducing the need for manual annotation of large-scale and irregular point cloud datasets. The current modus operandi for representation learning is self-supervised learning, which has shown great potential for improving point cloud understanding. Nevertheless, it remains an open problem how to employ auto-encoding for learning universal 3-D representations of irregularly structured point clouds, as previous methods focus on either global shapes or local geometries. To this end, we present a cascaded self-supervised point cloud representation learning framework, dubbed Curriculumformer, aiming to tame curriculum pre-training for enhanced point cloud understanding. Our main idea lies in devising a progressive pre-training strategy, which trains the Transformer in an easy-to-hard manner. Specifically, we first pre-train the Transformer using an upsampling strategy, which allows it to learn global information. Then, we follow up with a completion strategy, which enables the Transformer to gain insight into local geometries. Finally, we propose a Multi-Modal Multi-Modality Contrastive Learning (M4CL) strategy to enhance the ability of representation learning by enriching the Transformer with semantic information. In this way, the pre-trained Transformer can be easily transferred to a wide range of downstream applications. We demonstrate the superior performance of Curriculumformer on various discriminant and generative tasks, outperforming state-of-the-art methods. Moreover, Curriculumformer can also be integrated into other off-the-shelf methods to promote their performance. Our code is available at https://github.com/Fayeben/Curriculumformer.
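The easy-to-hard ordering described in the abstract (upsampling, then completion, then M4CL contrastive learning) can be pictured as a staged schedule. The following is only a minimal sketch of such a schedule and is not taken from the authors' repository: the `Stage` class, the `stage_at` helper, and every epoch budget are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str    # pretext task used during this stage
    epochs: int  # hypothetical epoch budget, NOT a value from the paper


# Stage order follows the abstract; the epoch counts are assumptions.
CURRICULUM: List[Stage] = [
    Stage("upsampling", 100),        # global shape information
    Stage("completion", 100),        # local geometry
    Stage("m4cl_contrastive", 50),   # semantic information
]


def total_epochs(curriculum: List[Stage] = CURRICULUM) -> int:
    """Sum of all stage budgets."""
    return sum(stage.epochs for stage in curriculum)


def stage_at(epoch: int, curriculum: List[Stage] = CURRICULUM) -> Stage:
    """Return the stage that is active at a zero-based global epoch."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    boundary = 0
    for stage in curriculum:
        boundary += stage.epochs  # first epoch index after this stage
        if epoch < boundary:
            return stage
    raise ValueError("epoch is past the end of the curriculum")
```

A training loop would call `stage_at(epoch)` once per epoch and switch the pretext loss whenever the returned stage name changes, while the same Transformer weights carry over from one stage to the next.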
