DCCN: A dual-cross contrastive neural network for 3D point cloud representation learning

Cited by: 5

Authors:

Wu, Xiaopeng ^{[1]}

Shi, Guangsi ^{[2]}

Zhao, Zexing ^{[1]}

Li, Mingjie ^{[3]}

Gao, Xiaojun ^{[1]}

Yan, Xiaoli ^{[1]}

Affiliations:

[1] Northwest A&F Univ, Coll Mech & Elect Engn, Yangling 712100, Peoples R China

[2] Monash Univ, Fac Engn, Dept Chem & Biol Engn, Clayton, Vic 3800, Australia

[3] Stanford Univ, Radiat Oncol, Palo Alto, CA 94305 USA

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Vol. 249

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation;

Keywords:

Point cloud; Representation learning; Self-supervised learning; Contrastive learning; Few-shot learning;

DOI:

10.1016/j.eswa.2024.123564

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The proliferation of depth cameras and LiDAR sensors in actual industrial environments has fueled the pursuit of an effective and efficient 3D point cloud model that enables us to perceive and interact with the physical world. However, the intrinsic complexity of 3D semantic information poses significant challenges to model design, including spatial rotation invariance and irregular point cloud structure, which fundamentally impact the representation and behavior of 3D point cloud systems. Existing methods have either relied heavily on labeling information in a supervised learning setting or failed to effectively capture the inherent patterns of 3D point clouds within a self-supervised learning framework, leading to poor performance in specific downstream tasks. To address these limitations, this paper introduces a self-supervised framework, the Dual-Cross Contrastive Neural Network (DCCN), for 3D point cloud representation learning. DCCN leverages cross-view, cross-network, and domain-specific knowledge distillation to enhance the extraction of hidden features from point clouds and fully exploit the capabilities of the encoder. DCCN employs a pseudo-Siamese network consisting of an online network and a target network, facilitating knowledge interaction and distillation. The method extracts internal states from augmented 3D point clouds by learning cross-view relationships and optimizes model parameters through intra-modal cross-network learning. We incorporate a momentum-updating mechanism without shared weights in the Siamese network architecture to distill knowledge and enhance the role differentiation between the online and target networks. Experimental results demonstrate that our approach outperforms a range of supervised and self-supervised learning methods across four downstream tasks on three representative datasets. Ablation studies validate the component-wise effectiveness of the cross-view, cross-network, and momentum-updating learning objectives in achieving superior point cloud representation. The overall findings establish our method, DCCN, as an effective solution for 3D point cloud representation learning in real-world applications.
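
The abstract names two mechanisms without giving formulas: a momentum update that moves the target network toward the online network, and a contrastive objective over augmented views. The following is a minimal Python sketch of how such mechanisms are commonly written, not the authors' implementation. The exponential-moving-average form, the InfoNCE loss, the function names `momentum_update` and `info_nce`, the momentum coefficient `m`, and the `temperature` value are all assumptions chosen for illustration.

```python
import math


def momentum_update(online, target, m=0.99):
    """Assumed EMA rule: new_target = m * target + (1 - m) * online.

    `online` and `target` are flat lists of parameters. No weights are
    shared: the target only drifts slowly toward the online values.
    """
    return [m * t + (1.0 - m) * o for o, t in zip(online, target)]


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(queries, keys, temperature=0.1):
    """Mean InfoNCE loss (an assumed choice of contrastive loss).

    queries[i] and keys[i] are features of two views of the same sample
    (the positive pair); every other key acts as a negative.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [cosine(q, k) / temperature for k in keys]
        peak = max(logits)  # subtract the max for numerical stability
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denom - logits[i]
    return total / len(queries)


if __name__ == "__main__":
    # Hypothetical toy features standing in for encoder outputs.
    online_feats = [[1.0, 0.0], [0.0, 1.0]]
    target_feats = [[0.9, 0.1], [0.1, 0.9]]
    print("loss:", info_nce(online_feats, target_feats))
    print("target params:", momentum_update([1.0, 2.0], [0.0, 0.0], m=0.9))
```

In this sketch, queries would come from the online network and keys from the target network, so only the online parameters receive gradients while the target follows through `momentum_update`. Whether DCCN pairs its views and networks in exactly this way is not stated in the abstract.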


Pages: 12

