Semantic segmentation of mobile mapping point clouds via multi-view label transfer

Cited by: 4
Authors
Peters, Torben [1]
Brenner, Claus [2]
Schindler, Konrad [1]
Affiliations
[1] Swiss Fed Inst Technol, Photogrammetry & Remote Sensing, CH-8093 Zurich, Switzerland
[2] Leibniz Univ Hannover, Inst Cartog & Geoinformat, D-30167 Hannover, Germany
Keywords
Semantic segmentation; 3D point clouds; Multi-view; Convolutional neural network (CNN); Label transfer; CONVOLUTIONAL NEURAL-NETWORKS; 3D; VISION; IMAGES; FUSION
DOI
10.1016/j.isprsjprs.2023.05.018
Chinese Library Classification number
P9 [Physical geography]
Discipline classification code
0705; 070501
Abstract
We study how to learn semantic segmentation of 3D point clouds from small training sets. The problem arises because annotating 3D point clouds is a lot more time-consuming and error-prone than annotating 2D images. On the one hand, this means that one cannot afford to create a large enough training dataset for each new project. On the other hand, it also means that there is not nearly as much public data available as there is for images, which one could use to pretrain a generic feature extractor that could then, with only little dedicated training data, be adapted ("fine-tuned") to the task at hand. To address this bottleneck, we explore the possibility of transferring knowledge from the 2D image domain to 3D point clouds. That strategy is of particular interest for mobile mapping systems that capture both point clouds and images in a fully calibrated setting that makes it easy to connect the two domains. We find that, as expected, naively segmenting in image space and mapping the resulting labels onto the point cloud is not sufficient, as visual ambiguities, residual calibration errors, etc. affect the result. Instead, we propose a system that learns to merge image evidence from a varying number of viewpoints, together with 3D geometry information, into a common representation that encodes point-wise 3D semantics. To validate our approach, we make use of a new mobile mapping dataset with 88M annotated 3D points and 2205 oriented multi-view images. In a series of experiments, we show how much label noise is caused by simplistic label transfer, and how well existing semantic segmentation architectures can correct it. Finally, we demonstrate that adding our learned 2D-to-3D multi-view label transfer significantly improves the performance of different segmentation backbones.
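The naive baseline mentioned in the abstract (segment each image, then map the labels onto the point cloud) can be illustrated with a minimal sketch. This is not the authors' learned method: it assumes an ideal pinhole camera, a majority vote across views, and no occlusion test. The names `project_point`, `naive_label_transfer`, and the camera dictionary keys `R`, `t`, `f`, `cx`, `cy` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def project_point(point, camera):
    """Project a 3D world point into a pinhole camera.

    Returns integer (col, row) pixel coordinates, or None if the point
    lies behind the camera. `camera` is a hypothetical dict with a 3x3
    rotation `R`, translation `t`, focal length `f`, and principal
    point (`cx`, `cy`).
    """
    R, t = camera["R"], camera["t"]
    # World frame to camera frame: X_cam = R * X_world + t
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    u = camera["f"] * xc[0] / xc[2] + camera["cx"]
    v = camera["f"] * xc[1] / xc[2] + camera["cy"]
    return int(round(u)), int(round(v))


def naive_label_transfer(points, cameras, label_images, unlabeled=-1):
    """Assign each 3D point the majority label over all views that see it.

    `label_images[k]` is the per-pixel class map predicted for camera k,
    stored as a list of rows. Occlusions are ignored (an assumption of
    this sketch), which is one source of the label noise the abstract
    refers to.
    """
    labels = []
    for p in points:
        votes = Counter()
        for cam, img in zip(cameras, label_images):
            pix = project_point(p, cam)
            if pix is None:
                continue
            col, row = pix
            # Only count views where the projection falls inside the image
            if 0 <= row < len(img) and 0 <= col < len(img[0]):
                votes[img[row][col]] += 1
        labels.append(votes.most_common(1)[0][0] if votes else unlabeled)
    return labels
```

Because every view votes with equal weight and visibility is not checked, a single mis-segmented or mis-registered image can flip a point's label, which is the kind of error a learned multi-view fusion is meant to reduce.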
Pages: 30-39
Page count: 10