Multi-View CNN Feature Aggregation with ELM Auto-Encoder for 3D Shape Recognition

Authors
Zhi-Xin Yang
Lulu Tang
Kun Zhang
Pak Kin Wong
Affiliation
[1] University of Macau, Department of Electromechanical Engineering, Faculty of Science and Technology
Source
Cognitive Computation | 2018 / Vol. 10
Keywords
ELM auto-encoder; Convolutional neural networks; 3D shape recognition; Multi-view feature aggregation
DOI
Not available
Abstract
Fast and accurate detection of 3D shapes is a fundamental task for robotic systems in intelligent tracking and automatic control. View-based 3D shape recognition has attracted increasing attention because human perception of 3D objects relies mainly on multiple 2D observations from different viewpoints. However, most existing multi-view-based cognitive computation methods rely on straightforward pairwise comparisons among the projected images followed by a weak aggregation mechanism, which results in heavy computational cost and low recognition accuracy. To address these problems, a novel network structure named MCEA is proposed, combining multi-view convolutional neural networks (M-CNNs), an extreme learning machine auto-encoder (ELM-AE), and an ELM classifier for comprehensive feature learning, effective feature aggregation, and efficient classification of 3D shapes. The framework combines the advantages of a deep CNN architecture with the robust ELM-AE feature representation and the fast ELM classifier for 3D model recognition. Compared with existing set-to-set image comparison methods, the proposed shape-to-shape matching strategy converts each highly informative 3D model into a single compact feature descriptor via cognitive computation. Moreover, the proposed method runs much faster and achieves a good balance between classification accuracy and computational efficiency. Experimental results on the benchmark Princeton ModelNet, ShapeNet Core 55, and PSB datasets show that the proposed framework achieves higher classification and retrieval accuracy in much shorter time than state-of-the-art methods.
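The pipeline named in the abstract (per-view CNN features, ELM-AE aggregation, ELM classification) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' MCEA implementation: random vectors stand in for CNN view features, views are aggregated by simple concatenation before the ELM-AE, and all function names, layer sizes, and the regularization constant `C` are hypothetical choices made for the example.

```python
import numpy as np

def elm_autoencoder(X, n_hidden, C=1e3, seed=None):
    """Learn ELM-AE output weights beta (n_hidden x n_features) by reconstructing X."""
    rng = np.random.default_rng(seed)
    # Random input weights, orthogonalized (a common ELM-AE choice when
    # n_hidden <= n_features); they are never trained.
    W, _ = np.linalg.qr(rng.standard_normal((X.shape[1], n_hidden)))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    # Regularized least squares: beta = (H^T H + I/C)^-1 H^T X
    return np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ X)

def elm_train(X, y, n_hidden, n_classes, C=1e3, seed=None):
    """Train a single-hidden-layer ELM classifier; returns (W, b, beta)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    T = np.eye(n_classes)[y]  # one-hot targets
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# --- Toy data: stand-ins for CNN features (ASSUMPTION, not real view features) ---
rng = np.random.default_rng(0)
n_shapes, n_views, feat_dim, n_classes = 60, 12, 16, 3
y = rng.integers(0, n_classes, size=n_shapes)
views = rng.standard_normal((n_shapes, n_views, feat_dim)) + y[:, None, None]

# Aggregate views by concatenation (ASSUMPTION), then compress with the ELM-AE
# into one compact descriptor per shape.
X = views.reshape(n_shapes, -1)                 # (60, 192)
beta = elm_autoencoder(X, n_hidden=32, seed=1)  # (32, 192)
Z = X @ beta.T                                  # (60, 32) shape descriptors

# Classify the compact descriptors with an ELM.
model = elm_train(Z, y, n_hidden=64, n_classes=n_classes, seed=2)
preds = elm_predict(model, Z)
```

Both stages are fitted with a single regularized least-squares solve rather than iterative backpropagation, which is the usual reason ELM-based models train quickly.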
Pages: 908-921
Number of pages: 13