Curriculumformer: Taming Curriculum Pre-Training for Enhanced 3-D Point Cloud Understanding

Citations: 0
Authors
Fei, Ben [1]
Luo, Tianyue [1]
Yang, Weidong [1]
Liu, Liwen [1]
Zhang, Rui [1]
He, Ying [2]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[2] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore 639798, Singapore
Funding
National Natural Science Foundation of China
Keywords
Point cloud compression; Transformers; Task analysis; Representation learning; Geometry; Data models; Accuracy; 3-D representation learning; curriculum learning; point clouds; self-supervised learning; transformer;
DOI
10.1109/TNNLS.2024.3406587
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Learning universal representations of 3-D point clouds is essential for reducing the need for manual annotation of large-scale and irregular point cloud datasets. The current modus operandi for representation learning is self-supervised learning, which has shown great potential for improving point cloud understanding. Nevertheless, it remains an open problem how to employ auto-encoding for learning universal 3-D representations of irregularly structured point clouds, as previous methods focus on either global shapes or local geometries. To this end, we present a cascaded self-supervised point cloud representation learning framework, dubbed Curriculumformer, aiming to tame curriculum pre-training for enhanced point cloud understanding. Our main idea lies in devising a progressive pre-training strategy, which trains the Transformer in an easy-to-hard manner. Specifically, we first pre-train the Transformer using an upsampling strategy, which allows it to learn global information. Then, we follow up with a completion strategy, which enables the Transformer to gain insight into local geometries. Finally, we propose a Multi-Modal Multi-Modality Contrastive Learning (M4CL) strategy to enhance the ability of representation learning by enriching the Transformer with semantic information. In this way, the pre-trained Transformer can be easily transferred to a wide range of downstream applications. We demonstrate the superior performance of Curriculumformer on various discriminative and generative tasks, outperforming state-of-the-art methods. Moreover, Curriculumformer can also be integrated into other off-the-shelf methods to promote their performance. Our code is available at https://github.com/Fayeben/Curriculumformer.
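The easy-to-hard ordering described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); it only shows one plausible way to build training pairs for an upsampling stage and a completion stage, and uses a generic InfoNCE loss as a stand-in for the contrastive stage. All function names, the `keep_ratio` and `drop_ratio` parameters, and the choice of InfoNCE are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
Pair = Tuple[List[Point], List[Point]]


def make_upsampling_pair(cloud: List[Point], keep_ratio: float = 0.25, rng=None) -> Pair:
    """Assumed stage 1: a uniformly sparse subsample is the input, the full cloud the target."""
    rng = random.Random(0) if rng is None else rng
    k = max(1, int(len(cloud) * keep_ratio))
    return rng.sample(cloud, k), cloud


def make_completion_pair(cloud: List[Point], drop_ratio: float = 0.25, rng=None) -> Pair:
    """Assumed stage 2: remove the local region nearest a random seed point."""
    rng = random.Random(0) if rng is None else rng
    seed = rng.choice(cloud)
    n_drop = int(len(cloud) * drop_ratio)
    # Sort by squared distance to the seed and drop the closest n_drop points.
    by_dist = sorted(cloud, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, seed)))
    return by_dist[n_drop:], cloud


def info_nce(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]],
             temperature: float = 0.1) -> float:
    """Generic InfoNCE loss (a stand-in, not M4CL): a[i] and b[i] are a positive pair."""
    total = 0.0
    for i, ai in enumerate(a):
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def run_curriculum(stages: List[Tuple[str, Callable[[List[Point]], Pair]]],
                   cloud: List[Point]) -> List[Tuple[str, int, int]]:
    """Visit the stages in the given (easy-to-hard) order and record pair sizes.

    A real pipeline would train the shared encoder on each stage before moving on.
    """
    log = []
    for name, make_pair in stages:
        inp, tgt = make_pair(cloud)
        log.append((name, len(inp), len(tgt)))
    return log


if __name__ == "__main__":
    cloud = [(float(i), 0.0, 0.0) for i in range(100)]
    stages = [("upsampling", make_upsampling_pair), ("completion", make_completion_pair)]
    print(run_curriculum(stages, cloud))
```

In this sketch the upsampling pair keeps the overall shape but thins it out, while the completion pair removes a contiguous region, which roughly mirrors the global-then-local progression the abstract describes. The contrastive stage would follow these two and requires paired embeddings from different modalities, which are outside the scope of this example.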
Pages: 1-15
Page count: 15
Related papers
50 in total
  • [41] MUAN: Multiscale Upsampling Aggregation Network for 3-D Point Cloud Segmentation
    Dai, Jiaxi
    Zhang, Youbing
    Bi, Dong
    Lan, Jianping
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [42] A review of point cloud segmentation for understanding 3D indoor scenes
    Yuliang Sun
    Xudong Zhang
    Yongwei Miao
    Visual Intelligence, 2 (1):
  • [43] Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences
    Wang, Guangming
    Liu, Hanwen
    Chen, Muyao
    Yang, Yehui
    Liu, Zhe
    Wang, Hesheng
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2021, 70
  • [44] Pseudo-Reference Point Cloud Quality Measurement Based on Joint 2-D and 3-D Distortion Description
    Tu, Renwei
    Jiang, Gangyi
    Yu, Mei
    Zhang, Yun
    Luo, Ting
    Zhu, Zhongjie
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [45] Efficient Global Navigational Planning in 3-D Structures Based on Point Cloud Tomography
    Yang, Bowen
    Cheng, Jie
    Xue, Bohuan
    Jiao, Jianhao
    Liu, Ming
    IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2025, 30 (01) : 321 - 332
  • [46] A Comprehensive Study and Comparison of Core Technologies for MPEG 3-D Point Cloud Compression
    Liu, Hao
    Yuan, Hui
    Liu, Qi
    Hou, Junhui
    Liu, Ju
    IEEE TRANSACTIONS ON BROADCASTING, 2020, 66 (03) : 701 - 717
  • [47] Dual-Graph Attention Convolution Network for 3-D Point Cloud Classification
    Huang, Chang-Qin
    Jiang, Fan
    Huang, Qiong-Hao
    Wang, Xi-Zhe
    Han, Zhong-Mei
    Huang, Wei-Yu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 4813 - 4825
  • [48] PGN3DCD: Prior-Knowledge-Guided Network for Urban 3-D Point Cloud Change Detection
    Zhan, Wenxiao
    Cheng, Ruozhen
    Chen, Jing
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [49] Transformer Enhanced Hierarchical 3D Point Cloud Semantic Segmentation
    Liu, Yaohua
    Ma, Yue
    Xu, Min
    2ND INTERNATIONAL CONFERENCE ON APPLIED MATHEMATICS, MODELLING, AND INTELLIGENT COMPUTING (CAMMIC 2022), 2022, 12259
  • [50] Enhanced 3D Point Cloud from a Light Field Image
    Farhood, Helia
    Perry, Stuart
    Cheng, Eva
    Kim, Juno
    REMOTE SENSING, 2020, 12 (07)