EFECL: Feature encoding enhancement with contrastive learning for indoor 3D object detection

Cited by: 1

Authors:

Duan, Yao [1]

Yi, Renjiao [1]

Gao, Yuanming [1]

Xu, Kai [1]

Zhu, Chenyang [1]

Affiliations:

[1] Natl Univ Def Technol, Sch Comp, Changsha 410000, Peoples R China

Source:

COMPUTATIONAL VISUAL MEDIA | 2023 / Vol. 9 / No. 04

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

indoor scene; object detection; contrastive learning; feature enhancement;

DOI:

10.1007/s41095-023-0366-0

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Good initial proposals are critical for 3D object detection. However, because the geometry of indoor scenes varies significantly, incomplete and noisy proposals are inevitable in most cases. Mining feature information from these "bad" proposals may mislead the detection. Contrastive learning provides a feasible way to represent proposals, since it can align complete and incomplete/noisy proposals in feature space. The aligned feature space helps build robust 3D representations even when bad proposals are given. Therefore, we devise a new contrastive learning framework for indoor 3D object detection, called EFECL, that learns robust 3D representations by contrastive learning of proposals on two different levels. Specifically, we optimize both instance-level and category-level contrasts to align features by capturing instance-specific characteristics and semantic-aware common patterns. Furthermore, we propose an enhanced feature aggregation module to extract more general and informative features for contrastive learning. Evaluations on the ScanNet V2 and SUN RGB-D benchmarks demonstrate the generalizability and effectiveness of our method, which achieves improvements of 12.3% and 7.3% on the two datasets over the benchmark alternatives. The code and models are publicly available at https://github.com/YaraDuan/EFECL.
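The two contrast levels named in the abstract can be illustrated with a minimal sketch. This is not the EFECL implementation: it assumes an InfoNCE-style loss for the instance level and a supervised-contrastive-style loss for the category level, neither of which the abstract confirms. The function names (`instance_contrast`, `category_contrast`), the temperature value, and the toy 2D features are all hypothetical.

```python
import math


def l2_normalize(v):
    # Scale a feature vector to unit length (falls back to 1.0 for a zero vector).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_sum_exp(values):
    # Numerically stable log(sum(exp(v))).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def instance_contrast(complete, noisy, temperature=0.1):
    """Assumed InfoNCE form: complete[i] and noisy[i] describe the same object
    (positive pair); noisy[j] with j != i act as negatives."""
    c = [l2_normalize(v) for v in complete]
    n = [l2_normalize(v) for v in noisy]
    total = 0.0
    for i, ci in enumerate(c):
        logits = [dot(ci, nj) / temperature for nj in n]
        total += log_sum_exp(logits) - logits[i]
    return total / len(c)


def category_contrast(features, labels, temperature=0.1):
    """Assumed supervised-contrastive form: proposals sharing a category label
    are positives, all other proposals are negatives."""
    f = [l2_normalize(v) for v in features]
    total, count = 0.0, 0
    for i, fi in enumerate(f):
        others = [j for j in range(len(f)) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # no same-category partner for this proposal
        logits = {j: dot(fi, f[j]) / temperature for j in others}
        denom = log_sum_exp(list(logits.values()))
        total += sum(denom - logits[j] for j in positives) / len(positives)
        count += 1
    return total / count if count else 0.0


# Toy features: two objects, each with a complete and a noisy proposal.
complete = [[1.0, 0.0], [0.0, 1.0]]
noisy_aligned = [[0.9, 0.1], [0.1, 0.9]]
noisy_swapped = [[0.1, 0.9], [0.9, 0.1]]
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
```

Under these assumptions, both losses are small when matching proposals lie close together in feature space and large when they do not, which is the alignment behavior the abstract describes in words.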

Pages: 875-892

Page count: 18

