Self-supervised learning of rotation-invariant 3D point set features using transformer and its self-distillation

Authors
Furuya, Takahiko [1]
Chen, Zhoujie [1,2]
Ohbuchi, Ryutarou [1]
Kuang, Zhenzhong [2]
Affiliations
[1] Univ Yamanashi, Dept Comp Sci & Engn, 4-3-11 Takeda, Kofu, Yamanashi 4008511, Japan
[2] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310000, Peoples R China
Funding
Japan Society for the Promotion of Science (JSPS)
Keywords
Deep learning; Self-supervised learning; 3D point set; Feature representation; Rotation invariance
DOI
10.1016/j.cviu.2024.104025
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Invariance against rotations of 3D objects is an important property in analyzing 3D point set data. Conventional 3D point set DNNs having rotation invariance typically obtain accurate 3D shape features via supervised learning by using labeled 3D point sets as training samples. However, due to the rapid increase in 3D point set data and the high cost of labeling, a framework to learn rotation-invariant 3D shape features from numerous unlabeled 3D point sets is required. This paper proposes a novel self-supervised learning framework for acquiring accurate and rotation-invariant 3D point set features at object-level. Our proposed lightweight DNN architecture decomposes an input 3D point set into multiple global-scale regions, called tokens, that preserve the spatial layout of partial shapes composing the 3D object. We employ a self-attention mechanism to refine the tokens and aggregate them into an expressive rotation-invariant feature per 3D point set. Our DNN is effectively trained by using pseudo-labels generated by a self-distillation framework. To facilitate the learning of accurate features, we propose to combine multi-crop and cut-mix data augmentation techniques to diversify 3D point sets for training. Through a comprehensive evaluation, we empirically demonstrate that (1) existing rotation-invariant DNN architectures designed for supervised learning do not necessarily learn accurate 3D shape features under a self-supervised learning scenario, and (2) our proposed algorithm learns rotation-invariant 3D point set features that are more accurate than those learned by existing algorithms.
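The rotation-invariance property named in the abstract can be illustrated with a minimal sketch. The example below is not the paper's architecture or feature: `centroid_distance_feature`, `rotate_z`, and `rotate_x` are hypothetical helpers introduced only for illustration. It computes a toy hand-crafted feature (the sorted distances of points to their centroid) and checks that the feature is unchanged, up to floating-point error, when the point set is rotated.

```python
import math
import random


def rotate_z(points, angle):
    """Rotate (x, y, z) points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def rotate_x(points, angle):
    """Rotate (x, y, z) points about the x-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x, c * y - s * z, s * y + c * z) for x, y, z in points]


def centroid_distance_feature(points):
    """Toy feature (an assumption, not the paper's): sorted point-to-centroid distances."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return sorted(math.dist(p, centroid) for p in points)


random.seed(0)
# A random 64-point set standing in for a 3D object.
cloud = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(64)]

# Compose two rotations to obtain a rotated copy of the same object.
rotated = rotate_x(rotate_z(cloud, 0.7), -1.3)

feat_a = centroid_distance_feature(cloud)
feat_b = centroid_distance_feature(rotated)

# Largest element-wise difference between the two features.
max_diff = max(abs(a - b) for a, b in zip(feat_a, feat_b))
print("feature unchanged under rotation:", max_diff < 1e-9)
```

Rotations preserve distances, so any feature built purely from distances is invariant by construction. Such hand-crafted features discard much of the shape's spatial layout, whereas the abstract describes tokens that preserve it.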
Pages: 14