Deep learning based 3D segmentation in computer vision: A survey

Cited by: 8
Authors
He, Yong [1]
Yu, Hongshan [1]
Liu, Xiaoyan [1]
Yang, Zhengeng [2]
Sun, Wei [1]
Anwar, Saeed [3]
Mian, Ajmal [4]
Affiliations
[1] Hunan Univ, Quanzhou Inst Ind Design & Machine Intelligence In, Coll Elect & Informat Engn, Sch Robot, Lushan South Rd, Changsha 410082, Hunan, Peoples R China
[2] Hunan Normal Univ, Lushan South Rd, Changsha 410081, Hunan, Peoples R China
[3] Australian Natl Univ, Canberra, ACT 2600, Australia
[4] Univ Western Australia, 35 Stirling Hwy, Perth, WA 6009, Australia
Funding
Australian Research Council; National Natural Science Foundation of China;
Keywords
Computer vision; Deep learning; Deep neural network; 3D semantic segmentation; 3D instance segmentation; 3D part segmentation; SEMANTIC SEGMENTATION; POINT; NETWORKS; LIDAR;
DOI
10.1016/j.inffus.2024.102722
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D segmentation is a fundamental and challenging problem in computer vision with applications in autonomous driving and robotics. It has received significant attention from the computer vision, graphics and machine learning communities. Conventional methods for 3D segmentation, based on hand-crafted features and machine learning classifiers, lack generalization ability. Driven by their success in 2D computer vision, deep learning techniques have recently become the tool of choice for 3D segmentation tasks. This has led to an influx of many methods in the literature that have been evaluated on different benchmark datasets. Whereas survey papers on RGB-D and point cloud segmentation exist, there is a lack of a recent in-depth survey that covers all 3D data modalities and application domains. This paper fills the gap and comprehensively surveys the recent progress in deep learning-based 3D segmentation techniques. We cover over 230 works from the last six years, analyze their strengths and limitations, and discuss their competitive results on benchmark datasets. The survey provides a summary of the most commonly used pipelines and finally highlights promising research directions for the future.
Pages: 24
Related papers
234 in total
[21]   4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks [J].
Choy, Christopher ;
Gwak, JunYoung ;
Savarese, Silvio .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :3070-3079
[22]   Couprie C, 2013, Arxiv, DOI arXiv:1301.3572
[23]   3DMV: Joint 3D-Multi-view Prediction for 3D Semantic Scene Segmentation [J].
Dai, Angela ;
Niessner, Matthias .
COMPUTER VISION - ECCV 2018, PT X, 2018, 11214 :458-474
[24]   ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans [J].
Dai, Angela ;
Ritchie, Daniel ;
Bokeloh, Martin ;
Reed, Scott ;
Sturm, Juergen ;
Niessner, Matthias .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :4578-4587
[25]   ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes [J].
Dai, Angela ;
Chang, Angel X. ;
Savva, Manolis ;
Halber, Maciej ;
Funkhouser, Thomas ;
Niessner, Matthias .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :2432-2443
[26]   PointVector: A Vector Representation In Point Cloud Analysis [J].
Deng, Xin ;
Zhang, Wenyu ;
Ding, Qing ;
Zhang, XinMing .
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, :9455-9465
[27]   Duan L., 2024, Adv. Neural Inf. Process. Syst., V36
[28]   3D Bird's-Eye-View Instance Segmentation [J].
Elich, Cathrin ;
Engelmann, Francis ;
Kontogianni, Theodora ;
Leibe, Bastian .
PATTERN RECOGNITION, DAGM GCPR 2019, 2019, 11824 :48-61
[29]   Engelmann F., 2018, P EUR C COMP VIS
[30]   Engelmann F, 2020, IEEE INT CONF ROBOT, P9463, DOI 10.1109/ICRA40945.2020.9197503