Mutual information-driven self-supervised point cloud pre-training

Times Cited: 0
Authors
Xu, Weichen [1 ]
Fu, Tianhao [1 ]
Cao, Jian [1 ]
Zhao, Xinyu [1 ]
Xu, Xinxin [1 ]
Cao, Xixin [1 ]
Zhang, Xing [1 ,2 ]
Affiliations
[1] Peking Univ, Sch Software & Microelect, Beijing 100871, Peoples R China
[2] Peking Univ, Shenzhen Grad Sch, Key Lab Integrated Microsyst, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
关键词
Self-supervised learning; Autonomous driving; Point cloud scene understanding; Mutual information; High-level features; OPTIMIZATION;
DOI
10.1016/j.knosys.2024.112741
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning universal representations from unlabeled 3D point clouds is essential for improving the generalization and safety of autonomous driving. Generative self-supervised point cloud pre-training with low-level features as pretext tasks is a mainstream paradigm. However, from the perspective of mutual information, this approach is constrained by spatial information and entangled representations. In this study, we propose a generalized generative self-supervised point cloud pre-training framework called GPICTURE. High-level features were used as an additional pretext task to enhance the understanding of semantic information. Considering the varying reconstruction difficulty reflected by the discriminability of voxel features, we designed inter-class and intra-class discrimination-guided masking (I2Mask) to set the masking ratio adaptively. Furthermore, to ensure a hierarchical and stable reconstruction process, centered kernel alignment-guided hierarchical reconstruction and differential-gated progressive learning were employed to control multiple reconstruction tasks. Complete theoretical analyses demonstrated that using high-level features as a pretext task increases the mutual information between the latent features and both the high-level features and the input point cloud. On Waymo, nuScenes, and SemanticKITTI, we achieved 75.55% mAP for 3D object detection, 79.7% mIoU for 3D semantic segmentation, and 18.8% mIoU for occupancy prediction. Notably, with only 50% of the fine-tuning data, GPICTURE performed close to training from scratch with 100% of the fine-tuning data. In addition, visualizations consistent with downstream tasks and a 57% reduction in weight disparity demonstrated a better fine-tuning starting point. The project page is hosted at https://gpicture-page.github.io/.
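The abstract names centered kernel alignment (CKA) as the criterion guiding hierarchical reconstruction. How GPICTURE applies it is defined in the paper itself; the snippet below is only a minimal sketch of the standard linear CKA similarity score (Kornblith et al., 2019) on which such a criterion builds, with NumPy as an assumed dependency and linear_cka as a hypothetical helper name.

    import numpy as np

    def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
        """Linear CKA similarity between two feature matrices.

        x: (n_samples, d1) features, e.g. from one decoder stage.
        y: (n_samples, d2) features, e.g. a reconstruction target.
        Returns a value in [0, 1]; higher means more similar representations.
        """
        # Center each feature dimension so the score is invariant to offsets.
        x = x - x.mean(axis=0, keepdims=True)
        y = y - y.mean(axis=0, keepdims=True)
        # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F), the linear CKA formula.
        cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
        self_x = np.linalg.norm(x.T @ x, ord="fro")
        self_y = np.linalg.norm(y.T @ y, ord="fro")
        return float(cross / (self_x * self_y + 1e-12))

A score of this kind quantifies how similar two layers' representations are; the exact way GPICTURE uses it to assign reconstruction targets across decoder stages is specified in the paper, not here.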
Pages: 16
Related Papers
50 records in total
  • [21] Self-Supervised Underwater Image Generation for Underwater Domain Pre-Training
    Wu, Zhiheng
    Wu, Zhengxing
    Chen, Xingyu
    Lu, Yue
    Yu, Junzhi
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 14
  • [22] PointSmile: point self-supervised learning via curriculum mutual information
    Li, Xin
    Wei, Mingqiang
    Chen, Songcan
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (11)
  • [23] Masked Deformation Modeling for Volumetric Brain MRI Self-Supervised Pre-Training
    Lyu, Junyan
    Bartlett, Perry F.
    Nasrallah, Fatima A.
    Tang, Xiaoying
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2025, 44 (03) : 1596 - 1607
  • [24] Self-supervised depth super-resolution with contrastive multiview pre-training
    Qiao, Xin
    Ge, Chenyang
    Zhao, Chaoqiang
    Tosi, Fabio
    Poggi, Matteo
    Mattoccia, Stefano
    NEURAL NETWORKS, 2023, 168 : 223 - 236
  • [25] Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
    Chen, William
    Chang, Xuankai
    Peng, Yifan
    Ni, Zhaoheng
    Maiti, Soumi
    Watanabe, Shinji
    INTERSPEECH 2023, 2023, : 4404 - 4408
  • [26] Self-supervised pre-training improves fundus image classification for diabetic retinopathy
    Lee, Joohyung
    Lee, Eung-Joo
    REAL-TIME IMAGE PROCESSING AND DEEP LEARNING 2022, 2022, 12102
  • [27] Voice Deepfake Detection Using the Self-Supervised Pre-Training Model HuBERT
    Li, Lanting
    Lu, Tianliang
    Ma, Xingbang
    Yuan, Mengjiao
    Wan, Da
    APPLIED SCIENCES-BASEL, 2023, 13 (14)
  • [28] A SELF-SUPERVISED PRE-TRAINING FRAMEWORK FOR VISION-BASED SEIZURE CLASSIFICATION
    Hou, Jen-Cheng
    McGonigal, Aileen
    Bartolomei, Fabrice
    Thonnat, Monique
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 1151 - 1155
  • [29] A debiased self-training framework with graph self-supervised pre-training aided for semi-supervised rumor detection
    Qiao, Yuhan
    Cui, Chaoqun
    Wang, Yiying
    Jia, Caiyan
    NEUROCOMPUTING, 2024, 604
  • [30] Abdominal Organs and Pan-Cancer Segmentation Based on Self-supervised Pre-training and Self-training
    Li, He
    Han, Meng
    Wang, Guotai
    FAST, LOW-RESOURCE, AND ACCURATE ORGAN AND PAN-CANCER SEGMENTATION IN ABDOMEN CT, FLARE 2023, 2024, 14544 : 130 - 142