Multi-View CNN Feature Aggregation with ELM Auto-Encoder for 3D Shape Recognition

Cited by: 0

Authors:

Zhi-Xin Yang

Lulu Tang

Kun Zhang

Pak Kin Wong

Affiliations:

[1] University of Macau, Department of Electromechanical Engineering, Faculty of Science and Technology

Source:

Cognitive Computation | 2018 / Vol. 10

Keywords:

ELM auto-encoder; Convolutional neural networks; 3D shape recognition; Multi-view feature aggregation;

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Fast and accurate detection of 3D shapes is a fundamental task for robotic systems in intelligent tracking and automatic control. View-based 3D shape recognition has attracted increasing attention because human perception of 3D objects relies mainly on multiple 2D observations from different viewpoints. However, most existing multi-view cognitive computation methods use straightforward pairwise comparisons among the projected images followed by a weak aggregation mechanism, which results in heavy computational cost and low recognition accuracy. To address these problems, a novel network structure named MCEA, which combines multi-view convolutional neural networks (M-CNNs), an extreme learning machine auto-encoder (ELM-AE), and an ELM classifier, is proposed for comprehensive feature learning, effective feature aggregation, and efficient classification of 3D shapes. The framework combines the advantages of a deep CNN architecture with the robust ELM-AE feature representation and the fast ELM classifier for 3D model recognition. Compared with existing set-to-set image comparison methods, the proposed shape-to-shape matching strategy converts each highly informative 3D model into a single compact feature descriptor via cognitive computation. Moreover, the proposed method runs much faster and achieves a good balance between classification accuracy and computational efficiency. Experimental results on the benchmark Princeton ModelNet, ShapeNet Core 55, and PSB datasets show that the proposed framework achieves higher classification and retrieval accuracy in much shorter time than state-of-the-art methods.
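
The abstract names three stages: per-view CNN features, ELM-AE aggregation into one descriptor per shape, and an ELM classifier. The sketch below illustrates only the last two stages in a generic textbook form and is not the authors' implementation. Everything specific in it is an assumption: per-view features are taken as already computed, views are aggregated by simple concatenation before the auto-encoder, the hidden activation is tanh, and the names `elm_ae_fit`, `elm_ae_transform`, `elm_fit`, `elm_predict`, and `ridge_solve` are hypothetical.

```python
import numpy as np


def ridge_solve(H, T, c):
    # Regularized least squares: beta = (H^T H + I / c)^-1 H^T T
    return np.linalg.solve(H.T @ H + np.eye(H.shape[1]) / c, H.T @ T)


def random_hidden(X, n_hidden, rng):
    # Random, untrained hidden layer (the defining trait of an ELM).
    W = rng.standard_normal((X.shape[1], n_hidden)) / np.sqrt(X.shape[1])
    b = rng.standard_normal(n_hidden)
    return W, b


def elm_ae_fit(X, n_hidden, c, rng):
    # ELM auto-encoder: the regression target is the input itself.
    W, b = random_hidden(X, n_hidden, rng)
    H = np.tanh(X @ W + b)
    return ridge_solve(H, X, c)          # shape (n_hidden, n_features)


def elm_ae_transform(X, beta):
    # Project inputs with the learned output weights to get compact features.
    return X @ beta.T                    # shape (n_samples, n_hidden)


def elm_fit(X, y, n_classes, n_hidden, c, rng):
    # ELM classifier: random hidden layer, closed-form output weights.
    W, b = random_hidden(X, n_hidden, rng)
    H = np.tanh(X @ W + b)
    T = np.eye(n_classes)[y]             # one-hot targets
    return W, b, ridge_solve(H, T, c)


def elm_predict(X, model):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_shapes, n_views, dim, n_classes = 60, 12, 16, 3
    y = rng.integers(0, n_classes, size=n_shapes)
    # Synthetic stand-in for per-view CNN features (assumption, not real data).
    centers = rng.standard_normal((n_classes, dim))
    views = centers[y][:, None, :] + 0.3 * rng.standard_normal((n_shapes, n_views, dim))

    flat = views.reshape(n_shapes, -1)   # concatenate the views of each shape
    beta = elm_ae_fit(flat, n_hidden=32, c=10.0, rng=rng)
    descriptors = elm_ae_transform(flat, beta)   # one compact vector per shape

    model = elm_fit(descriptors, y, n_classes, n_hidden=64, c=10.0, rng=rng)
    print("training accuracy:", np.mean(elm_predict(descriptors, model) == y))
```

Both stages are fitted with a single linear solve rather than iterative back-propagation, which is the usual reason ELM-based components are described as fast.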


Pages: 908 - 921

Page count: 13

32 records in total

[1] Multi-View CNN Feature Aggregation with ELM Auto-Encoder for 3D Shape Recognition
Yang, Zhi-Xin
Tang, Lulu
Zhang, Kun
Wong, Pak Kin
COGNITIVE COMPUTATION, 2018, 10 (06) : 908 - 921
[2] MVPN: Multi-View Prototype Network for 3D Shape Recognition
Wu, Zizhao
Yang, Ping
Wang, Yigang
IEEE ACCESS, 2019, 7 : 130363 - 130372
[3] Multi-View 3D Shape Recognition via Correspondence-Aware Deep Learning
Xu, Yong
Zheng, Chaoda
Xu, Ruotao
Quan, Yuhui
Ling, Haibin
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 5299 - 5312
[4] Multi-Scale Multi-View Deep Feature Aggregation for Food Recognition
Jiang, Shuqiang
Min, Weiqing
Liu, Linhu
Luo, Zhengdong
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 265 - 276
[5] Multi-view convolutional vision transformer for 3D object recognition
Li, Jie
Liu, Zhao
Li, Li
Lin, Junqin
Yao, Jian
Tu, Jingmin
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 95
[6] MULTI-VIEW GAIT RECOGNITION USING 3D CONVOLUTIONAL NEURAL NETWORKS
Wolf, Thomas
Babaee, Mohammadreza
Rigoll, Gerhard
2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 4165 - 4169
[7] Review of multi-view 3D object recognition methods based on deep learning
Qi, Shaohua
Ning, Xin
Yang, Guowei
Zhang, Liping
Long, Peng
Cai, Weiwei
Li, Weijun
DISPLAYS, 2021, 69
[8] iMVS: Integrating multi-view information on multiple scales for 3D object recognition
Jiang, Jiaqin
Liu, Zhao
Li, Jie
Tu, Jingmin
Li, Li
Yao, Jian
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 100
[9] Group-pair deep feature learning for multi-view 3d model retrieval
Chen, Xiuxiu
Liu, Li
Zhang, Long
Zhang, Huaxiang
Meng, Lili
Liu, Dongmei
APPLIED INTELLIGENCE, 2022, 52 (02) : 2013 - 2022
[10] Self-supervised Multi-view Learning via Auto-encoding 3D Transformations
Gao, Xiang
Hu, Wei
Qi, Guo-Jun
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (01)
