FuseNet: a multi-modal feature fusion network for 3D shape classification

Cited by: 0
Authors
Zhao, Xin [1]
Chen, Yinhuang [1]
Yang, Chengzhuan [1]
Fang, Lincong [2]
Affiliations
[1] Zhejiang Normal Univ, Sch Comp Sci & Technol, 688 Yingbin Rd, Jinhua 321004, Zhejiang, Peoples R China
[2] Zhejiang Univ Finance & Econ, Sch Informat Management & Artificial Intelligence, 18 Xueyuan St, Hangzhou 310018, Zhejiang, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
3D shape classification; Multi-view; Point cloud; Feature fusion; Convolutional neural networks; Model
DOI
10.1007/s00371-024-03581-2
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Recent research on 3D shape classification has focused primarily on point cloud and multi-view methods. However, multi-view approaches inevitably lose structural information about 3D shapes because of camera-angle limitations. Point cloud methods apply max pooling over all points in a neural network to obtain a global feature, which loses local detail. These drawbacks limit 3D shape classification performance. This paper proposes FuseNet, a novel model that integrates multi-view and point cloud information and significantly improves the accuracy of 3D model classification. First, we propose a multi-view and point cloud part that extracts raw features from different convolution layers for the multiple views and the point cloud. Second, we adopt a multi-view pooling method that fuses features across views so that features from different convolution layers are integrated more effectively, and we propose an attention-based multi-view and point cloud fusion block that integrates point cloud and multi-view features. Finally, we extensively test our method on three benchmark datasets: ModelNet10, ModelNet40, and ShapeNet Core55. The experimental results show classification performance superior or comparable to previous state-of-the-art techniques for 3D shape classification.
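The abstract names three operations without defining them: max pooling over points, pooling across views, and attention-based fusion of the two modalities. The sketch below is only an illustration of those general ideas on toy vectors, not the FuseNet architecture. The function names (`max_pool`, `attention_fuse`), the dot-product scoring rule, and the toy feature values are all assumptions introduced here.

```python
import math


def max_pool(features):
    """Element-wise max over a list of equal-length feature vectors."""
    return [max(column) for column in zip(*features)]


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(view_feature, point_feature):
    """Hypothetical fusion rule (an assumption, not the paper's block):
    score each modality by its dot product with the mean of both vectors,
    softmax the two scores, and return the weighted sum."""
    query = [(v + p) / 2 for v, p in zip(view_feature, point_feature)]
    scores = [
        sum(q * x for q, x in zip(query, feature))
        for feature in (view_feature, point_feature)
    ]
    weights = softmax(scores)
    fused = [
        weights[0] * v + weights[1] * p
        for v, p in zip(view_feature, point_feature)
    ]
    return fused, weights


# Toy inputs: three per-view features and two per-point features (made up).
view_features = [[0.2, 0.9, 0.1], [0.7, 0.3, 0.4], [0.5, 0.6, 0.8]]
point_features = [[0.1, 0.4, 0.3], [0.6, 0.2, 0.9]]

pooled_views = max_pool(view_features)    # [0.7, 0.9, 0.8]
pooled_points = max_pool(point_features)  # [0.6, 0.4, 0.9]
fused, weights = attention_fuse(pooled_views, pooled_points)
```

Max pooling keeps only the largest response per channel, which is why the abstract describes it as discarding local detail. The softmax weights sum to one, so each fused channel lies between the corresponding view and point cloud values.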
Pages: 2973-2985
Page count: 13