Multi-View CNN Feature Aggregation with ELM Auto-Encoder for 3D Shape Recognition

Cited by: 0

Authors:

Zhi-Xin Yang

Lulu Tang

Kun Zhang

Pak Kin Wong

Affiliation:

[1] University of Macau, Department of Electromechanical Engineering, Faculty of Science and Technology

Source:

Cognitive Computation | 2018 / Vol. 10

Keywords:

ELM auto-encoder; Convolutional neural networks; 3D shape recognition; Multi-view feature aggregation

DOI:

Not available

Abstract:

Fast and accurate detection of 3D shapes is a fundamental task for robotic systems that perform intelligent tracking and automatic control. View-based 3D shape recognition has attracted increasing attention because human perception of 3D objects relies mainly on multiple 2D observations from different viewpoints. However, most existing multi-view-based cognitive computation methods use straightforward pairwise comparisons among the projected images, followed by a weak aggregation mechanism, which results in heavy computation cost and low recognition accuracy. To address these problems, a novel network structure combining multi-view convolutional neural networks (M-CNNs), an extreme learning machine auto-encoder (ELM-AE), and an ELM classifier, named MCEA, is proposed for comprehensive feature learning, effective feature aggregation, and efficient classification of 3D shapes. The framework combines the advantages of a deep CNN architecture with the robust ELM-AE feature representation and the fast ELM classifier for 3D model recognition. Compared with existing set-to-set image comparison methods, the proposed shape-to-shape matching strategy converts each highly informative 3D model into a single compact feature descriptor via cognitive computation. Moreover, the proposed method runs much faster and achieves a good balance between classification accuracy and computational efficiency. Experimental results on the benchmark Princeton ModelNet, ShapeNet Core 55, and PSB datasets show that the proposed framework achieves higher classification and retrieval accuracy in much shorter time than state-of-the-art methods.
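The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the two ELM stages it names, not the authors' MCEA code. It assumes the per-view CNN features are already extracted (random arrays stand in for them here), that views are aggregated by simple concatenation before the ELM-AE, and that both stages use a tanh hidden layer with a ridge-regularised least-squares solution. All function names, sizes, and the regularisation constant `C` are hypothetical.

```python
import numpy as np


def elm_ae_fit(X, n_hidden, C=1.0, seed=0):
    """Fit an ELM auto-encoder: random hidden layer, output weights solved to reconstruct X."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))
    if n_features >= n_hidden:
        # Orthogonalise the random input weights (common ELM-AE practice).
        W, _ = np.linalg.qr(W)
    b = rng.standard_normal(n_hidden)
    b /= np.linalg.norm(b)
    H = np.tanh(X @ W + b)
    # Ridge-regularised least squares: beta = (H^T H + I/C)^-1 H^T X
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ X)
    return beta  # shape (n_hidden, n_features)


def elm_ae_transform(X, beta):
    """Project inputs into the compact space using the transposed output weights."""
    return X @ beta.T  # shape (n_samples, n_hidden)


def elm_fit(X, y, n_classes, n_hidden, C=1.0, seed=1):
    """Fit an ELM classifier: random hidden layer, output weights solved against one-hot targets."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    T = np.eye(n_classes)[y]  # one-hot targets
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta


def elm_predict(X, model):
    """Return the class index with the largest output score for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


# Stand-in data (assumption): 60 shapes, 12 rendered views each, 32-dim CNN feature per view.
rng = np.random.default_rng(42)
n_shapes, n_views, feat_dim, n_classes = 60, 12, 32, 4
view_features = 0.1 * rng.standard_normal((n_shapes, n_views, feat_dim))
labels = rng.integers(0, n_classes, size=n_shapes)

# Aggregate views by concatenation (assumption), then compress to one descriptor per shape.
stacked = view_features.reshape(n_shapes, n_views * feat_dim)
ae_beta = elm_ae_fit(stacked, n_hidden=64)
descriptors = elm_ae_transform(stacked, ae_beta)

# Classify the compact descriptors with a second ELM.
clf = elm_fit(descriptors, labels, n_classes=n_classes, n_hidden=128)
predictions = elm_predict(descriptors, clf)
print(descriptors.shape, predictions.shape)
```

Because both stages solve a single linear system instead of running iterative back-propagation, training cost is dominated by one matrix factorisation per stage, which is the usual reason ELM-based models are described as fast.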

Pages: 908-921

Page count: 13
