Point Clouds are Specialized Images: A Knowledge Transfer Approach for 3D Understanding

Cited by: 0

Authors:

Kang, Jiachen [1]

Jia, Wenjing [1]

He, Xiangjian [2]

Lam, Kin Man [3]

Affiliations:

[1] Univ Technol Sydney, Sch Elect & Data Engn, Sydney, NSW 2007, Australia

[2] Univ Nottingham Ningbo, Sch Comp Sci, Ningbo 315100, Peoples R China

[3] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Kowloon, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2024 / Vol. 26

Keywords:

Point cloud compression; Three-dimensional displays; Transformers; Task analysis; Data models; Image coding; Knowledge transfer; Cross-modal learning; point cloud understanding; self-supervision; transfer learning;

DOI:

10.1109/TMM.2024.3412330

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Self-supervised representation learning (SSRL) has gained increasing attention in point cloud understanding as a way to address the challenges posed by 3D data scarcity and high annotation costs. This paper presents PCExpert, a novel SSRL approach that reinterprets point clouds as "specialized images". This conceptual shift allows PCExpert to leverage knowledge derived from the large-scale image modality more directly and deeply, by extensively sharing parameters with a pre-trained image encoder in a multi-way Transformer architecture. The parameter-sharing strategy, combined with an additional pretext task for pre-training, i.e., transformation estimation, enables PCExpert to outperform the state of the art in a variety of tasks, with a remarkable reduction in the number of trainable parameters. Notably, PCExpert's performance under LINEAR fine-tuning (e.g., a 90.02% overall accuracy on ScanObjectNN) already closely approaches the results obtained with FULL model fine-tuning (92.66%), demonstrating its effective representation capability.

Pages: 10755 - 10765

Number of pages: 11

50 items in total

[21] OST: Efficient One-Stream Network for 3D Single Object Tracking in Point Clouds
Zhao, Xiantong
Han, Yinan
Tian, Shengjing
Liu, Jian
Liu, Xiuping
IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 990 - 1002
[22] Point Transformer-Based Salient Object Detection Network for 3-D Measurement Point Clouds
Wei, Zeyong
Chen, Baian
Wang, Weiming
Chen, Honghua
Wei, Mingqiang
Li, Jonathan
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 11
[23] Swin3D: A pretrained transformer backbone for 3D indoor scene understanding
Yang, Yu-Qi
Guo, Yu-Xiao
Xiong, Jian-Yu
Liu, Yang
Pan, Hao
Wang, Peng-Shuai
Tong, Xin
Guo, Baining
COMPUTATIONAL VISUAL MEDIA, 2025, 11 (01) : 83 - 101
[24] MD3D: Mixture-Density-Based 3D Object Detection in Point Clouds
Choi, Jaeseok
Song, Yeji
Kim, Yerim
Yoo, Jaeyoung
Kwak, Nojun
IEEE ACCESS, 2022, 10 : 104011 - 104022
[25] Domain adaptation learning for 3D point clouds: A survey
Fan W.
Lin X.
Luo H.
Guo W.
Wang H.
Dai C.
National Remote Sensing Bulletin, 2024, 28 (04) : 825 - 842
[26] General Hypernetwork Framework for Creating 3D Point Clouds
Spurek, Przemyslaw
Zieba, Maciej
Tabor, Jacek
Trzcinski, Tomasz
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 9995 - 10008
[27] 3D Model Retrieval Based on a 3D Shape Knowledge Graph
Nie, Weizhi
Wang, Ya
Song, Dan
Li, Wenhui
IEEE ACCESS, 2020, 8 : 142632 - 142641
[28] Masked Autoencoders in 3D Point Cloud Representation Learning
Jiang, Jincen
Lu, Xuequan
Zhao, Lizhi
Dazeley, Richard
Wang, Meili
IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 820 - 831
[29] SimLOG: Simultaneous Local-Global Feature Learning for 3D Object Detection in Indoor Point Clouds
Wei, Mingqiang
Chen, Baian
Nan, Liangliang
Xie, Haoran
Gu, Lipeng
Lu, Dening
Wang, Fu Lee
Li, Qing
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, : 19482 - 19495
[30] Rethinking Masked Representation Learning for 3D Point Cloud Understanding
Wang, Chuxin
Zha, Yixin
He, Jianfeng
Yang, Wenfei
Zhang, Tianzhu
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 247 - 262
