FuseNet: a multi-modal feature fusion network for 3D shape classification

Cited by: 0
Authors
Zhao, Xin [1 ]
Chen, Yinhuang [1 ]
Yang, Chengzhuan [1 ]
Fang, Lincong [2 ]
Affiliations
[1] Zhejiang Normal Univ, Sch Comp Sci & Technol, 688 Yingbin Rd, Jinhua 321004, Zhejiang, Peoples R China
[2] Zhejiang Univ Finance & Econ, Sch Informat Management & Artificial Intelligence, 18 Xueyuan St, Hangzhou 310018, Zhejiang, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
3D shape classification; Multi-view; Point cloud; Feature fusion; Convolutional neural networks; Model;
DOI
10.1007/s00371-024-03581-2
CLC classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Recently, research on 3D shape classification has focused primarily on point cloud and multi-view methods. However, multi-view approaches inevitably lose structural information about the 3D shape because of camera-angle limitations, while point cloud methods typically obtain a global feature by max pooling over all points, which discards local detail. These drawbacks limit the classification performance of both families of methods. This paper proposes FuseNet, a novel model that integrates multi-view and point cloud information and significantly improves 3D shape classification accuracy. First, we design a multi-view branch and a point cloud branch to extract raw features from different convolution layers of each modality. Second, we adopt a multi-view pooling method that fuses the features of multiple views across convolution layers more effectively, and we propose an attention-based multi-view and point cloud fusion block to integrate the features of the two modalities. Finally, we evaluate our method extensively on three benchmark datasets: ModelNet10, ModelNet40, and ShapeNet Core55. The experimental results demonstrate classification performance superior or comparable to previously established state-of-the-art techniques for 3D shape classification.
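As a rough illustration of the fusion step described in the abstract, the following is a minimal, hypothetical PyTorch-style sketch of an attention-based multi-view and point cloud fusion block. The module name (AttentionFusionBlock), feature dimensions, projection layers, and per-modality weighting scheme are illustrative assumptions, not the authors' actual implementation.

import torch
import torch.nn as nn

class AttentionFusionBlock(nn.Module):
    # Hypothetical sketch: both modality features are projected to a shared
    # space, a scalar attention score is computed for each modality, and the
    # attention-weighted features are summed into one fused descriptor.
    def __init__(self, view_dim=1024, point_dim=1024, fused_dim=512):
        super().__init__()
        self.view_proj = nn.Linear(view_dim, fused_dim)
        self.point_proj = nn.Linear(point_dim, fused_dim)
        self.attn = nn.Linear(fused_dim, 1)  # one score per modality

    def forward(self, view_feat, point_feat):
        # view_feat:  (B, V, view_dim) per-view features from the image branch
        # point_feat: (B, point_dim)   global feature from the point branch
        view_global = view_feat.max(dim=1).values        # multi-view pooling
        v = torch.relu(self.view_proj(view_global))      # (B, fused_dim)
        p = torch.relu(self.point_proj(point_feat))      # (B, fused_dim)
        scores = torch.cat([self.attn(v), self.attn(p)], dim=1)  # (B, 2)
        weights = torch.softmax(scores, dim=1)            # modality attention
        fused = weights[:, :1] * v + weights[:, 1:] * p   # (B, fused_dim)
        return fused

# Example: fuse 12 view features with a point cloud feature for a batch of 8.
if __name__ == "__main__":
    block = AttentionFusionBlock()
    views = torch.randn(8, 12, 1024)
    points = torch.randn(8, 1024)
    print(block(views, points).shape)  # torch.Size([8, 512])

The sketch pools the per-view features first, mirroring the multi-view pooling step, and then lets learned attention weights decide how much each modality contributes to the fused descriptor.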
Pages: 2973-2985
Number of pages: 13
Related papers (59 records in total)
[21] Kanezaki, Asako; Matsushita, Yasuyuki; Nishida, Yoshifumi. RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 5010-5019.
[22] Khan, Salman H.; Guo, Yulan; Hayat, Munawar; Barnes, Nick. Unsupervised Primitive Discovery for Improved 3D Generative Modeling. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019: 9731-9740.
[23] Klokov, Roman; Lempitsky, Victor. Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models. IEEE International Conference on Computer Vision (ICCV), 2017: 863-872.
[24] Kumawat, Sudhakar; Raman, Shanmuganathan. LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019: 4898-4907.
[25] Li, Bo; Johan, Henry. 3D model retrieval using hybrid features and class information. Multimedia Tools and Applications, 2013, 62(3): 821-846.
[26] Li, Jialin; Saydam, Sarp; Xu, Yuanyuan; Liu, Boge; Li, Binghao; Lin, Xuemin; Zhang, Wenjie. Class-aware tiny object recognition over large-scale 3D point clouds. Neurocomputing, 2023, 529: 166-181.
[27] Li, Y. Advances in Neural Information Processing Systems, 2018, 31.
[28] Lin, Tsung-Yi; Dollar, Piotr; Girshick, Ross; He, Kaiming; Hariharan, Bharath; Belongie, Serge. Feature Pyramid Networks for Object Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 936-944.
[29] Liu, An-An; Zhang, Yuwei; Zhang, Chenyu; Li, Wenhui; Lv, Bo; Lei, Lei; Li, Xuanya. Prototype-based semantic consistency learning for unsupervised 2D image-based 3D shape retrieval. Multimedia Systems, 2023, 29(4): 1995-2007.
[30] Liu, Hui; Tian, Shuaihua. Deep 3D point cloud classification and segmentation network based on GateNet. The Visual Computer, 2024, 40(2): 971-981.