PointSmile: point self-supervised learning via curriculum mutual information

Cited by: 1
Authors
Li, Xin [1 ,2 ]
Wei, Mingqiang [1 ,2 ]
Chen, Songcan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Sch Comp Sci & Technol, Nanjing 210016, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Shenzhen Res Inst, Shenzhen 518063, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
PointSmile; self-supervised learning; curriculum mutual information; point cloud; representation learning;
DOI
10.1007/s11432-023-4085-9
CLC number
TP [Automation Technology, Computer Technology];
Discipline code
0812;
Abstract
Self-supervised learning is attracting significant attention from researchers in the point cloud processing field. However, due to the natural sparsity and irregularity of point clouds, effectively extracting discriminative and transferable features for efficient training on downstream tasks remains an unsolved challenge. We therefore propose PointSmile, a reconstruction-free self-supervised learning paradigm that maximizes curriculum mutual information (CMI) across replicas of point cloud objects. From a how-and-what-to-learn perspective, PointSmile imitates human curriculum learning: it starts with easier topics and gradually progresses to more complex ones. To solve "how-to-learn", we introduce curriculum data augmentation (CDA) of point clouds. CDA encourages PointSmile to follow a learning path from easy data samples to hard data samples, so that the latent space is dynamically shaped to create better embeddings. To solve "what-to-learn", we maximize both feature- and class-wise CMI to better extract discriminative features of point clouds. Unlike most existing methods, PointSmile requires neither a pretext task nor cross-modal data to yield rich latent representations, and it can be easily transferred to various backbones. We demonstrate the effectiveness and robustness of PointSmile on downstream tasks such as object classification and segmentation. The results show that PointSmile outperforms existing self-supervised methods and compares favorably with popular fully supervised methods on various standard architectures. The code is available at https://github.com/theaalee/PointSmile.
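The two ingredients the abstract describes — an easy-to-hard curriculum over augmentations ("how-to-learn") and mutual information maximization between augmented replicas ("what-to-learn") — can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: the function names (`curriculum_strength`, `augment`, `infonce_mi_lower_bound`), the linear schedule, and the specific augmentations are not taken from the paper's code; the paper's actual CDA schedule and CMI objective may differ.

```python
import numpy as np

def curriculum_strength(epoch, total_epochs, s_min=0.1, s_max=1.0):
    """Easy-to-hard schedule (illustrative): augmentation strength
    grows linearly from s_min to s_max over training."""
    t = epoch / max(total_epochs - 1, 1)
    return s_min + (s_max - s_min) * t

def augment(points, strength, rng):
    """Toy point cloud augmentation: z-axis rotation and jitter,
    both scaled by the curriculum strength."""
    theta = rng.uniform(-np.pi, np.pi) * strength
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    noise = rng.normal(scale=0.02 * strength, size=points.shape)
    return points @ rot.T + noise

def infonce_mi_lower_bound(z1, z2, temperature=0.1):
    """InfoNCE-style lower bound on the mutual information between
    embeddings of two replicas; row i of z1 and z2 are a positive pair,
    all other rows serve as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature              # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(np.mean(np.diag(log_probs)) + np.log(len(z1)))
```

The bound is at most log N (the batch size caps how much mutual information InfoNCE can certify), and it rises as the paired embeddings become more similar than the negatives — which is why strengthening the augmentations gradually, rather than starting hard, keeps positive pairs learnable early in training.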
Pages: 15