DANCE-NET: Density-aware convolution networks with context encoding for airborne LiDAR point cloud classification

Cited by: 50
Authors
Li, Xiang [1, 2, 3, 4]
Wang, Lingjing [1, 2, 3, 4]
Wang, Mingyang [1, 2, 3]
Wen, Congcong [2]
Fang, Yi [1, 2, 3, 4]
Affiliations
[1] NYU Tandon, NYU Multimedia & Visual Comp Lab, Brooklyn, NY USA
[2] NYU Abu Dhabi, NYU Multimedia & Visual Comp Lab, Abu Dhabi, U Arab Emirates
[3] NYU, Tandon Sch Engn, New York, NY USA
[4] NYU Abu Dhabi, Dept Elect & Comp Engn, Abu Dhabi, U Arab Emirates
Keywords
Airborne LiDAR; Point cloud classification; Density-aware convolution; Context encoding; Support vector machine; Neural networks; Data fusion
DOI
10.1016/j.isprsjprs.2020.05.023
Chinese Library Classification (CLC) number
P9 [Physical geography]
Discipline classification code
0705; 070501
Abstract
Airborne LiDAR point cloud classification has been a long-standing problem in photogrammetry and remote sensing. Early efforts either combine hand-crafted feature engineering with machine learning-based classification models or leverage the power of conventional convolutional neural networks (CNNs) on projected feature images. Recently proposed deep learning-based methods tend to develop new convolution operators that can be applied directly to raw point clouds for representative point feature learning. Although these methods have achieved satisfactory performance for the classification of airborne LiDAR point clouds, they cannot adequately recognize fine-grained local structures because of the uneven density distribution of 3D point clouds. In this paper, to address this issue, we introduce a density-aware convolution module that uses the point-wise density to reweight the learnable weights of convolution kernels. The proposed convolution module can approximate continuous convolution on unevenly distributed 3D point sets. Based on this convolution module, we further develop a multi-scale CNN model with downsampling and upsampling blocks to perform per-point semantic labeling. In addition, to regularize the global semantic context, we implement a context encoding module to predict a global context encoding and formulate a context encoding regularizer that enforces the predicted context encoding to be aligned with the ground-truth one. The overall network can be trained in an end-to-end fashion and directly produces the desired classification results in one network forward pass. Experiments on the ISPRS 3D Labeling Dataset and the 2019 Data Fusion Contest Dataset demonstrate the effectiveness and superiority of the proposed method for airborne LiDAR point cloud classification.
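The density reweighting idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the formula, so the Gaussian density estimate, the inverse-density normalisation, the neighbourhood size `k`, the `bandwidth`, and the toy `kernel_fn` (a stand-in for a learned weight function) are all assumptions made for illustration.

```python
import numpy as np


def pointwise_density(points, k, bandwidth):
    """Estimate a per-point density from the k nearest neighbours.

    Assumption: a Gaussian kernel density estimate; the paper may use
    a different estimator.
    """
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)                 # (N, N) squared distances
    idx = np.argsort(d2, axis=1)[:, :k]             # (N, k), includes the point itself
    nd2 = np.take_along_axis(d2, idx, axis=1)       # (N, k) neighbour distances
    density = np.exp(-nd2 / (2.0 * bandwidth ** 2)).mean(axis=1)
    return density, idx


def density_aware_conv(points, feats, kernel_fn, k=8, bandwidth=1.0):
    """Aggregate neighbour features with kernel weights scaled by inverse density."""
    density, idx = pointwise_density(points, k, bandwidth)
    inv_density = 1.0 / density[idx]                          # (N, k)
    inv_density /= inv_density.sum(axis=1, keepdims=True)     # assumed normalisation
    rel = points[idx] - points[:, None, :]                    # (N, k, 3) relative offsets
    weights = kernel_fn(rel)                                  # (N, k, C_in, C_out)
    weights = weights * inv_density[..., None, None]          # density reweighting
    return np.einsum("nkc,nkco->no", feats[idx], weights)     # (N, C_out)


# Toy usage with hypothetical sizes; W stands in for learnable parameters.
rng = np.random.default_rng(0)
c_in, c_out = 4, 5
points = rng.normal(size=(20, 3))
feats = rng.normal(size=(20, c_in))
W = rng.normal(size=(3, c_in * c_out))


def kernel_fn(rel):
    # Maps each relative offset to a (C_in, C_out) weight matrix.
    return np.tanh(rel @ W).reshape(rel.shape[0], rel.shape[1], c_in, c_out)


out = density_aware_conv(points, feats, kernel_fn, k=8, bandwidth=1.0)
density, _ = pointwise_density(points, 8, 1.0)
```

Under these assumptions, neighbours in densely sampled regions contribute less to each output feature than neighbours in sparse regions, which is one way to make a discrete sum behave more like a continuous convolution on unevenly sampled points.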
Pages: 128-139
Number of pages: 12