FuseNet: a multi-modal feature fusion network for 3D shape classification

Cited by: 0

Authors:

Zhao, Xin ^{[1]}

Chen, Yinhuang ^{[1]}

Yang, Chengzhuan ^{[1]}

Fang, Lincong ^{[2]}

Affiliations:

[1] Zhejiang Normal Univ, Sch Comp Sci & Technol, 688 Yingbin Rd, Jinhua 321004, Zhejiang, Peoples R China

[2] Zhejiang Univ Finance & Econ, Sch Informat Management & Artificial Intelligence, 18 Xueyuan St, Hangzhou 310018, Zhejiang, Peoples R China

Source:

VISUAL COMPUTER | 2025 / Vol. 41 / Issue 04

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

3D shape classification; Multi-view; Point cloud; Feature fusion; CONVOLUTIONAL NEURAL-NETWORKS; MODEL;

DOI:

10.1007/s00371-024-03581-2

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Subject classification code:

081202 ; 0835 ;

Abstract:

Recent research in 3D shape classification has focused mainly on point cloud and multi-view methods. However, multi-view approaches inevitably lose structural information about 3D shapes because of camera angle limitations. Point cloud methods use a neural network with max pooling over all points to obtain a global feature, which discards local detail. These weaknesses limit the performance of 3D shape classification. This paper proposes FuseNet, a novel model that integrates multi-view and point cloud information and significantly improves the accuracy of 3D model classification. First, we propose a multi-view and point cloud part that obtains raw features from different convolution layers of the multi-view and point cloud branches. Second, we adopt a multi-view pooling method to fuse the features of multiple views, so that features from different convolution layers are integrated more effectively, and we propose an attention-based multi-view and point cloud fusion block that integrates the point cloud and multi-view features. Finally, we test our method extensively on three benchmark datasets: ModelNet10, ModelNet40, and ShapeNet Core55. The experimental results show classification performance superior or comparable to previous state-of-the-art techniques for 3D shape classification.
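The pooling and fusion steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no architectural details, so the function names (`max_pool`, `softmax`, `attention_fuse`), the toy feature values, and the scoring rule inside `attention_fuse` (mean activation per modality) are all assumptions made for illustration. The sketch shows why max pooling yields a single global vector that keeps only the largest activation per channel, and how two modality features might be combined with softmax attention weights.

```python
import math


def max_pool(features):
    """Element-wise max over a list of equal-length feature vectors."""
    return [max(column) for column in zip(*features)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(view_feature, point_feature):
    """Hypothetical fusion rule (an assumption, not the paper's block):
    score each modality by its mean activation, convert the two scores
    to softmax weights, and return the weighted sum of the features."""
    scores = [
        sum(view_feature) / len(view_feature),
        sum(point_feature) / len(point_feature),
    ]
    w_view, w_point = softmax(scores)
    return [w_view * v + w_point * p for v, p in zip(view_feature, point_feature)]


# Toy data (made up): 3 points and 2 views, each with a 4-dimensional feature.
point_features = [
    [0.1, 0.9, 0.2, 0.0],
    [0.7, 0.3, 0.1, 0.4],
    [0.2, 0.5, 0.8, 0.1],
]
view_features = [
    [0.6, 0.1, 0.3, 0.9],
    [0.2, 0.4, 0.5, 0.3],
]

# One vector for the whole point cloud; per-point detail is discarded here.
global_point = max_pool(point_features)
# One vector summarising all rendered views.
pooled_view = max_pool(view_features)
# Attention-weighted combination of the two modalities.
fused = attention_fuse(pooled_view, global_point)

print(global_point)
print(pooled_view)
print(fused)
```

In a real network the attention scores would come from learned layers rather than a fixed mean, and the features would be tensors produced by convolutional and point-based backbones.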

References

Pages: 2973-2985

Page count: 13

59 entries in total

[1] [Anonymous], 2015, PROC CVPR IEEE

[2] GIFT: Towards Scalable 3D Shape Retrieval [J].

Bai, Song; Bai, Xiang; Zhou, Zhichao; Zhang, Zhaoxiang; Tian, Qi; Latecki, Longin Jan.

IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (06) :1257-1271

[3] GIFT: A Real-time and Scalable 3D Shape Search Engine [J].

Bai, Song; Bai, Xiang; Zhou, Zhichao; Zhang, Zhaoxiang; Latecki, Longin Jan.

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :5023-5032

[4] A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection [J].

Cai, Zhaowei; Fan, Quanfu; Feris, Rogerio S.; Vasconcelos, Nuno.

COMPUTER VISION - ECCV 2016, PT IV, 2016, 9908 :354-370

[5] DDGCN: graph convolution network based on direction and distance for point cloud learning [J].

Chen, Lifang; Zhang, Qian.

VISUAL COMPUTER, 2023, 39 (03) :863-873

[6] SliceNet: A proficient model for real-time 3D shape-based recognition [J].

Chen, Xuzhan; Chen, Youping; Gupta, Kashish; Zhou, Jie; Najjaran, Homayoun.

NEUROCOMPUTING, 2018, 316 :144-155

[7] Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis [J].

Dai, Angela; Qi, Charles Ruizhongtai; Niessner, Matthias.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :6545-6554

[8] Direction-induced convolution for point cloud analysis [J].

Fang, Yuan; Xu, Chunyan; Zhou, Chuanwei; Cui, Zhen; Hu, Chunlong.

MULTIMEDIA SYSTEMS, 2022, 28 (02) :457-468

[9] Feng YX, 2018, 2018 3RD IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION ENGINEERING (ICITE), P264, DOI 10.1109/ICITE.2018.8492700

[10] Furuya T., 2016, BMVC, V7, P8
