Mutual information-driven self-supervised point cloud pre-training

Cited by: 0
Authors
Xu, Weichen [1 ]
Fu, Tianhao [1 ]
Cao, Jian [1 ]
Zhao, Xinyu [1 ]
Xu, Xinxin [1 ]
Cao, Xixin [1 ]
Zhang, Xing [1 ,2 ]
Affiliations
[1] Peking Univ, Sch Software & Microelect, Beijing 100871, Peoples R China
[2] Peking Univ, Shenzhen Grad Sch, Key Lab Integrated Microsyst, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Self-supervised learning; Autonomous driving; Point cloud scene understanding; Mutual information; High-level features; OPTIMIZATION;
DOI
10.1016/j.knosys.2024.112741
CLC number (Chinese Library Classification)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Learning universal representations from unlabeled 3D point clouds is essential for improving the generalization and safety of autonomous driving. Generative self-supervised point cloud pre-training with low-level features as pretext tasks is a mainstream paradigm. However, from the perspective of mutual information, this approach is constrained by spatial information and entangled representations. In this study, we propose a generalized generative self-supervised point cloud pre-training framework called GPICTURE. High-level features were used as an additional pretext task to enhance the understanding of semantic information. Considering the varying difficulties caused by the discrimination of voxel features, we designed inter-class and intra-class discrimination-guided masking (I2Mask) to set the masking ratio adaptively. Furthermore, to ensure a hierarchical and stable reconstruction process, centered kernel alignment-guided hierarchical reconstruction and differential-gated progressive learning were employed to control the multiple reconstruction tasks. A complete theoretical analysis demonstrated that high-level features enhance the mutual information between the latent features and both the high-level features and the input point cloud. On Waymo, nuScenes, and SemanticKITTI, we achieved 75.55% mAP for 3D object detection, 79.7% mIoU for 3D semantic segmentation, and 18.8% mIoU for occupancy prediction. Notably, with only 50% of the fine-tuning data, the performance of GPICTURE was close to that of training from scratch with 100% of the fine-tuning data. In addition, consistent visualizations with downstream tasks and a 57% reduction in weight disparity demonstrated a better fine-tuning starting point. The project page is hosted at https://gpicture-page.github.io/.
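The "centered kernel alignment-guided hierarchical reconstruction" mentioned in the abstract builds on the standard linear CKA similarity between two sets of features. For reference, below is a minimal NumPy sketch of linear CKA; the function name linear_cka and the NumPy-based formulation are illustrative assumptions for exposition, not the paper's actual implementation, which is not included in this record.

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear centered kernel alignment (CKA) between two feature matrices.

    X has shape (n_samples, d1) and Y has shape (n_samples, d2); rows of both
    matrices must correspond to the same samples. Returns a similarity in
    [0, 1] that is invariant to orthogonal transformations and isotropic
    scaling of either representation.
    """
    # Center every feature dimension across the sample axis.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F).
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return float(cross / (norm_x * norm_y))

# Example: compare 128-dim representations of 512 samples, where the second
# set is a random linear transform of the first (hypothetical data).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats_a = rng.normal(size=(512, 128))
    feats_b = feats_a @ rng.normal(size=(128, 128))
    print(f"CKA(A, B) = {linear_cka(feats_a, feats_b):.3f}")
```

In a setting like the paper's, such a layer-wise similarity score could inform which encoder layers supervise which reconstruction targets; the exact control scheme (differential-gated progressive learning) is specific to GPICTURE and described in the full text.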
Pages: 16