Point2SpatialCapsule: Aggregating Features and Spatial Relationships of Local Regions on Point Clouds Using Spatial-Aware Capsules

Cited by: 31
Authors
Wen, Xin [1]
Han, Zhizhong [2]
Liu, Xinhai [1]
Liu, Yu-Shen [3]
Affiliations
[1] Tsinghua Univ, Sch Software, Beijing 100084, Peoples R China
[2] Univ Maryland, Dept Comp Sci, College Pk, MD 20737 USA
[3] Tsinghua Univ, BNRist, Sch Software, Beijing 100084, Peoples R China
Keywords
Three-dimensional displays; Feature extraction; Shape; Routing; Aggregates; Machine learning; Spatial resolution; Point cloud; shape representation; feature aggregation; spatial relationships; capsule network; SHAPE; REPRESENTATION; PREDICTION; NETWORK; VIEW;
DOI
10.1109/TIP.2020.3019925
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Learning discriminative shape representation directly on point clouds is still challenging in 3D shape analysis and understanding. Recent studies usually involve three steps: first splitting a point cloud into some local regions, then extracting the corresponding feature of each local region, and finally aggregating all individual local region features into a global feature as shape representation using simple max-pooling. However, such pooling-based feature aggregation methods do not adequately take the spatial relationships (e.g. the relative locations to other regions) between local regions into account, which greatly limits the ability to learn discriminative shape representation. To address this issue, we propose a novel deep learning network, named Point2SpatialCapsule, for aggregating features and spatial relationships of local regions on point clouds, which aims to learn more discriminative shape representation. Compared with the traditional max-pooling based feature aggregation networks, Point2SpatialCapsule can explicitly learn not only geometric features of local regions but also the spatial relationships among them. Point2SpatialCapsule consists of two main modules. To resolve the disorder problem of local regions, the first module, named geometric feature aggregation, is designed to aggregate the local region features into the learnable cluster centers, which explicitly encodes the spatial locations from the original 3D space. The second module, named spatial relationship aggregation, is proposed for further aggregating the clustered features and the spatial relationships among them in the feature space using the spatial-aware capsules developed in this article. Compared to the previous capsule network based methods, the feature routing on the spatial-aware capsules can learn more discriminative spatial relationships among local regions for point clouds, which establishes a direct mapping between log priors and the spatial locations through feature clusters. Experimental results demonstrate that Point2SpatialCapsule outperforms the state-of-the-art methods in the 3D shape classification, retrieval and segmentation tasks under the well-known ModelNet and ShapeNet datasets.
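The limitation of max-pooling noted in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the toy data, the `max_pool` helper, and the region centres are all hypothetical, and the example only shows that channel-wise max-pooling yields the same global feature no matter which region each local feature belongs to.

```python
import random


def max_pool(region_features):
    """Channel-wise max over per-region feature vectors (hypothetical helper)."""
    return [max(channel) for channel in zip(*region_features)]


# Hypothetical toy data: 4 local regions, each with a 3-D centre
# and a 5-D feature vector. Real networks learn these features.
random.seed(0)
centers = [[random.random() for _ in range(3)] for _ in range(4)]
features = [[random.random() for _ in range(5)] for _ in range(4)]

# Layout A pairs each feature with its own region centre.
layout_a = list(zip(centers, features))
# Layout B attaches the same features to different centres,
# i.e. a different spatial arrangement of the same local parts.
layout_b = list(zip(centers, reversed(features)))

global_a = max_pool([feat for _, feat in layout_a])
global_b = max_pool([feat for _, feat in layout_b])

# The two layouts differ, yet the pooled global features are identical.
layouts_differ = layout_a != layout_b
pooled_equal = global_a == global_b
print(layouts_differ, pooled_equal)
```

Because the pooled vector ignores where each feature came from, two shapes built from the same parts in different arrangements are indistinguishable after pooling, which is the gap the abstract says the spatial-aware capsules are meant to close.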
Pages: 8855-8869
Number of pages: 15