Point Clouds are Specialized Images: A Knowledge Transfer Approach for 3D Understanding

Cited by: 0
Authors
Kang, Jiachen [1]
Jia, Wenjing [1]
He, Xiangjian [2]
Lam, Kin Man [3]
Affiliations
[1] Univ Technol Sydney, Sch Elect & Data Engn, Sydney, NSW 2007, Australia
[2] Univ Nottingham Ningbo, Sch Comp Sci, Ningbo 315100, Peoples R China
[3] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Point cloud compression; Three-dimensional displays; Transformers; Task analysis; Data models; Image coding; Knowledge transfer; Cross-modal learning; Point cloud understanding; Self-supervision; Transfer learning
DOI
10.1109/TMM.2024.3412330
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Self-supervised representation learning (SSRL) has gained increasing attention in point cloud understanding as a way to address the challenges posed by 3D data scarcity and high annotation costs. This paper presents PCExpert, a novel SSRL approach that reinterprets point clouds as "specialized images". This conceptual shift allows PCExpert to leverage knowledge derived from the large-scale image modality more directly and deeply, by extensively sharing parameters with a pre-trained image encoder in a multi-way Transformer architecture. The parameter-sharing strategy, combined with an additional pretext task for pre-training, namely transformation estimation, enables PCExpert to outperform the state of the art in a variety of tasks, with a remarkable reduction in the number of trainable parameters. Notably, PCExpert's performance under LINEAR fine-tuning (e.g., a 90.02% overall accuracy on ScanObjectNN) already closely approximates the results obtained with FULL model fine-tuning (92.66%), demonstrating its effective representation capability.
Pages: 10755 - 10765
Page count: 11
Related papers
50 in total
  • [21] OST: Efficient One-Stream Network for 3D Single Object Tracking in Point Clouds
    Zhao, Xiantong
    Han, Yinan
    Tian, Shengjing
    Liu, Jian
    Liu, Xiuping
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 990 - 1002
  • [22] Point Transformer-Based Salient Object Detection Network for 3-D Measurement Point Clouds
    Wei, Zeyong
    Chen, Baian
    Wang, Weiming
    Chen, Honghua
    Wei, Mingqiang
    Li, Jonathan
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 11
  • [23] Swin3D: A pretrained transformer backbone for 3D indoor scene understanding
    Yang, Yu-Qi
    Guo, Yu-Xiao
    Xiong, Jian-Yu
    Liu, Yang
    Pan, Hao
    Wang, Peng-Shuai
    Tong, Xin
    Guo, Baining
    COMPUTATIONAL VISUAL MEDIA, 2025, 11 (01) : 83 - 101
  • [24] MD3D: Mixture-Density-Based 3D Object Detection in Point Clouds
    Choi, Jaeseok
    Song, Yeji
    Kim, Yerim
    Yoo, Jaeyoung
    Kwak, Nojun
    IEEE ACCESS, 2022, 10 : 104011 - 104022
  • [25] Domain adaptation learning for 3D point clouds: A survey
    Fan W.
    Lin X.
    Luo H.
    Guo W.
    Wang H.
    Dai C.
    National Remote Sensing Bulletin, 2024, 28 (04) : 825 - 842
  • [26] General Hypernetwork Framework for Creating 3D Point Clouds
    Spurek, Przemyslaw
    Zieba, Maciej
    Tabor, Jacek
    Trzcinski, Tomasz
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 9995 - 10008
  • [27] 3D Model Retrieval Based on a 3D Shape Knowledge Graph
    Nie, Weizhi
    Wang, Ya
    Song, Dan
    Li, Wenhui
    IEEE ACCESS, 2020, 8 : 142632 - 142641
  • [28] Masked Autoencoders in 3D Point Cloud Representation Learning
    Jiang, Jincen
    Lu, Xuequan
    Zhao, Lizhi
    Dazeley, Richard
    Wang, Meili
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 820 - 831
  • [29] SimLOG: Simultaneous Local-Global Feature Learning for 3D Object Detection in Indoor Point Clouds
    Wei, Mingqiang
    Chen, Baian
    Nan, Liangliang
    Xie, Haoran
    Gu, Lipeng
    Lu, Dening
    Wang, Fu Lee
    Li, Qing
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, : 19482 - 19495
  • [30] Rethinking Masked Representation Learning for 3D Point Cloud Understanding
    Wang, Chuxin
    Zha, Yixin
    He, Jianfeng
    Yang, Wenfei
    Zhang, Tianzhu
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 247 - 262