Point Clouds are Specialized Images: A Knowledge Transfer Approach for 3D Understanding

Cited by: 0
Authors
Kang, Jiachen [1]
Jia, Wenjing [1]
He, Xiangjian [2]
Lam, Kin Man [3]
Affiliations
[1] Univ Technol Sydney, Sch Elect & Data Engn, Sydney, NSW 2007, Australia
[2] Univ Nottingham Ningbo, Sch Comp Sci, Ningbo 315100, Peoples R China
[3] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Point cloud compression; Three-dimensional displays; Transformers; Task analysis; Data models; Image coding; Knowledge transfer; Cross-modal learning; Point cloud understanding; Self-supervision; Transfer learning
DOI
10.1109/TMM.2024.3412330
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Self-supervised representation learning (SSRL) has gained increasing attention in point cloud understanding as a way to address 3D data scarcity and high annotation costs. This paper presents PCExpert, a novel SSRL approach that reinterprets point clouds as "specialized images". This conceptual shift allows PCExpert to leverage knowledge from the large-scale image modality more directly and deeply, by extensively sharing parameters with a pre-trained image encoder in a multi-way Transformer architecture. The parameter-sharing strategy, combined with an additional pre-training pretext task, transformation estimation, enables PCExpert to outperform the state of the art on a variety of tasks while substantially reducing the number of trainable parameters. Notably, PCExpert's performance under linear fine-tuning (e.g., 90.02% overall accuracy on ScanObjectNN) already closely approaches the result obtained with full-model fine-tuning (92.66%), demonstrating its effective representation capability.
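The abstract names transformation estimation as a pretext task but does not specify which transformations are used. As a minimal sketch of the general idea only, the snippet below assumes a single rotation about the z-axis: a known transformation is applied to a point cloud, and the applied parameter serves as a free training label. The helper names `rotate_z` and `make_pretext_sample` are hypothetical and are not taken from the paper.

```python
import math
import random


def rotate_z(points, angle):
    """Rotate a list of (x, y, z) points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def make_pretext_sample(points, rng):
    """Build one self-supervised sample: (transformed points, regression target).

    Assumption: the target is just the rotation angle. The paper's actual
    transformation family and target encoding are not given in the abstract.
    """
    angle = rng.uniform(-math.pi, math.pi)
    return rotate_z(points, angle), angle


# Toy point cloud: 128 random points in the cube [-1, 1]^3.
rng = random.Random(0)
cloud = [
    (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
    for _ in range(128)
]

# No human annotation is needed: the label comes from the transformation itself.
transformed, target = make_pretext_sample(cloud, rng)
```

An encoder trained on such pairs would receive `transformed` as input and regress `target`; the encoder and the loss are omitted here because the abstract does not describe them.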
Pages: 10755-10765
Number of pages: 11