Curriculumformer: Taming Curriculum Pre-Training for Enhanced 3-D Point Cloud Understanding

Times Cited: 0
Authors:
Fei, Ben [1 ]
Luo, Tianyue [1 ]
Yang, Weidong [1 ]
Liu, Liwen [1 ]
Zhang, Rui [1 ]
He, Ying [2 ]
Affiliations:
[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[2] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore 639798, Singapore
Funding:
National Natural Science Foundation of China;
Keywords:
Point cloud compression; Transformers; Task analysis; Representation learning; Geometry; Data models; Accuracy; 3-D representation learning; curriculum learning; point clouds; self-supervised learning; transformer;
DOI:
10.1109/TNNLS.2024.3406587
CLC Classification:
TP18 [Artificial Intelligence Theory];
Discipline Codes:
081104 ; 0812 ; 0835 ; 1405 ;
Abstract:
Learning universal representations of 3-D point clouds is essential for reducing the need for manual annotation of large-scale and irregular point cloud datasets. The current modus operandi for representation learning is self-supervised learning, which has shown great potential for improving point cloud understanding. Nevertheless, how to employ auto-encoding to learn universal 3-D representations of irregularly structured point clouds remains an open problem, as previous methods focus on either global shapes or local geometries. To this end, we present a cascaded self-supervised point cloud representation learning framework, dubbed Curriculumformer, aiming to tame curriculum pre-training for enhanced point cloud understanding. Our main idea lies in devising a progressive pre-training strategy that trains the Transformer in an easy-to-hard manner. Specifically, we first pre-train the Transformer with an upsampling strategy, which allows it to learn global information. We then follow up with a completion strategy, which enables the Transformer to gain insight into local geometries. Finally, we propose a Multi-Modal Multi-Modality Contrastive Learning (M4CL) strategy that enhances representation learning by enriching the Transformer with semantic information. In this way, the pre-trained Transformer can be easily transferred to a wide range of downstream applications. We demonstrate the superior performance of Curriculumformer on various discriminative and generative tasks, outperforming state-of-the-art methods. Moreover, Curriculumformer can also be integrated into other off-the-shelf methods to boost their performance. Our code is available at https://github.com/Fayeben/Curriculumformer.
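
As a reading aid, the following is a minimal, hypothetical PyTorch sketch of the curriculum's first (easy) stage as described in the abstract. ToyPointEncoder, up_head, chamfer, and the random tensors are illustrative stand-ins under assumed shapes, not the authors' API; the real implementation is in the repository linked above.

import torch
import torch.nn as nn

class ToyPointEncoder(nn.Module):
    # Hypothetical stand-in for the paper's Transformer backbone.
    def __init__(self, dim=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, pts):  # pts: (B, N, 3) -> per-point features: (B, N, dim)
        return self.mlp(pts)

def chamfer(a, b):
    # Symmetric Chamfer distance between point sets a: (B, N, 3) and b: (B, M, 3).
    d = torch.cdist(a, b)  # (B, N, M) pairwise Euclidean distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

encoder = ToyPointEncoder()
up_head = nn.Linear(64, 4 * 3)  # hypothetical head: predicts 4 points per input point
opt = torch.optim.AdamW(list(encoder.parameters()) + list(up_head.parameters()), lr=1e-4)

# Stage 1 (easy): upsampling -- reconstruct a dense cloud from a sparse one,
# which mainly exercises global shape information.
for step in range(100):
    sparse = torch.randn(8, 256, 3)   # fake batch; real inputs would be sampled shapes
    dense = torch.randn(8, 1024, 3)
    feats = encoder(sparse)                   # (8, 256, 64)
    pred = up_head(feats).reshape(8, -1, 3)   # (8, 1024, 3) upsampled prediction
    loss = chamfer(pred, dense)
    opt.zero_grad(); loss.backward(); opt.step()

# Stages 2 (completion: mask a region and predict the missing local geometry) and
# 3 (the paper's M4CL contrastive objective, aligning point features with other
# modalities) would reuse the same encoder with their own heads and losses,
# giving the easy-to-hard progression.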
Pages: 1-15
Page Count: 15
Related Papers:
50 records in total
  • [1] Tang, Yuan; Li, Xianzhi; Xu, Jinfeng; Yu, Qiao; Hu, Long; Hao, Yixue; Chen, Min. Point-LGMask: Local and Global Contexts Embedding for Point Cloud Pre-Training With Multi-Ratio Masking. IEEE Transactions on Multimedia, 2024, 26: 8360-8370.
  • [2] Yang, Hongxin; Huang, Shangfeng; Wang, Ruisheng; Wang, Xin. Self-Supervised Pre-Training for 3-D Roof Reconstruction on LiDAR Data. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 1-5.
  • [3] Sheng, Xiaoxiao; Shen, Zhiqiang; Wang, Longguang; Xiao, Gang. Learnable Query Contrast and Spatio-temporal Prediction on Point Cloud Video Pre-training. IEEE Latin America Transactions, 2024, 22(10): 821-828.
  • [4] Qin, Bowen; Hui, Binyuan; Wang, Lihan; Yang, Min; Li, Binhua; Huang, Fei; Si, Luo; Jiang, Qingshan; Li, Yongbin. Schema dependency-enhanced curriculum pre-training for table semantic parsing. Knowledge-Based Systems, 2023, 262.
  • [5] Li, Bing-He; Lu, Ching-Hu. Self-Training Enhanced Multitask Network for 3-D Point-Level Hybrid Scene Understanding for Autonomous Vehicles. IEEE Internet of Things Journal, 2024, 11(19): 31394-31406.
  • [6] Roggiolani, Gianmarco; Magistri, Federico; Guadagnino, Tiziano; Behley, Jens; Stachniss, Cyrill. Unsupervised Pre-Training for 3D Leaf Instance Segmentation. IEEE Robotics and Automation Letters, 2023, 8(11): 7448-7455.
  • [7] Wang, Yan; Zhao, Yining; Ying, Shihui; Du, Shaoyi; Gao, Yue. Rotation-Invariant Point Cloud Representation for 3-D Model Recognition. IEEE Transactions on Cybernetics, 2022, 52(10): 10948-10956.
  • [8] Xu, Weichen; Fu, Tianhao; Cao, Jian; Zhao, Xinyu; Xu, Xinxin; Cao, Xixin; Zhang, Xing. Mutual information-driven self-supervised point cloud pre-training. Knowledge-Based Systems, 2025, 307.
  • [9] Lu, Dening; Gao, Kyle; Xie, Qian; Xu, Linlin; Li, Jonathan. 3DGTN: 3-D Dual-Attention GLocal Transformer Network for Point Cloud Classification and Segmentation. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 1-13.
  • [10] Song, Yupeng; He, Fazhi; Duan, Yansong; Si, Tongzhen; Bai, Junwei. LSLPCT: An Enhanced Local Semantic Learning Transformer for 3-D Point Cloud Analysis. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60.