Multi-View Saliency Guided Deep Neural Network for 3-D Object Retrieval and Classification

Cited by: 56
Authors
Zhou, He-Yu [1 ]
Liu, An-An [1 ]
Nie, Wei-Zhi [1 ]
Nie, Jie [2 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Ocean Univ China, Coll Informat Sci & Engn, Qingdao 266100, Peoples R China
Keywords
Three-dimensional displays; Solid modeling; Visualization; Cameras; Feature extraction; Computational modeling; Shape; 3D object retrieval; 3D object classification; multi-view learning; saliency analysis; 3D MODEL RETRIEVAL; DESCRIPTORS; RECOGNITION;
DOI
10.1109/TMM.2019.2943740
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In this paper, we propose the multi-view saliency guided deep neural network (MVSG-DNN) for 3D object retrieval and classification. This method consists of three key modules. First, the module of model projection rendering captures multiple views of one 3D object. Second, the module of visual context learning applies a basic Convolutional Neural Network to extract visual features from individual views and then employs the saliency LSTM to adaptively select representative views based on the multi-view context. Finally, with this information, the module of multi-view representation learning generates the compiled 3D object descriptors with the designed classification LSTM for 3D object retrieval and classification. The proposed MVSG-DNN has two main contributions: 1) It can jointly realize the selection of representative views and the similarity measure by fully exploiting multi-view context; 2) It can discover the discriminative structure of a multi-view sequence without the constraints of specific camera settings. Consequently, it can support flexible 3D object retrieval and classification in real applications by avoiding fixed camera settings. Extensive comparison experiments on ModelNet10, ModelNet40, and ShapeNetCore55 demonstrate the superiority of MVSG-DNN against state-of-the-art methods.
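The abstract describes a three-stage pipeline: render multiple views of a 3D object, extract per-view CNN features, score representative views with a saliency LSTM, and aggregate the view sequence with a classification LSTM into a descriptor used for both retrieval and classification. The following is a minimal PyTorch sketch of such a pipeline; the ResNet-18 backbone, feature and hidden dimensions, softmax saliency weighting, and module names are illustrative assumptions rather than the authors' reference implementation.

    # Minimal sketch of an MVSG-DNN-style pipeline (illustrative assumptions, not the paper's code).
    import torch
    import torch.nn as nn
    import torchvision.models as models

    class MVSGDNNSketch(nn.Module):
        def __init__(self, num_classes, feat_dim=512, hidden_dim=512):
            super().__init__()
            # Visual context learning: a basic CNN backbone extracts per-view features.
            backbone = models.resnet18(weights=None)
            self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # globally pooled features
            # Saliency LSTM: scores each view from the multi-view context.
            self.saliency_lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
            self.saliency_fc = nn.Linear(hidden_dim, 1)
            # Classification LSTM: aggregates the saliency-weighted view sequence
            # into a single 3D object descriptor.
            self.class_lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
            self.classifier = nn.Linear(hidden_dim, num_classes)

        def forward(self, views):
            # views: (batch, num_views, 3, H, W), e.g. 12 rendered projections per object.
            b, v = views.shape[:2]
            feats = self.cnn(views.flatten(0, 1)).flatten(1)       # (b*v, feat_dim)
            feats = feats.view(b, v, -1)                           # (b, v, feat_dim)
            ctx, _ = self.saliency_lstm(feats)                     # multi-view context
            weights = torch.softmax(self.saliency_fc(ctx), dim=1)  # per-view saliency scores
            weighted = feats * weights                             # emphasize representative views
            _, (h, _) = self.class_lstm(weighted)
            descriptor = h[-1]                                     # object descriptor for retrieval
            logits = self.classifier(descriptor)                   # class scores
            return descriptor, logits

    if __name__ == "__main__":
        model = MVSGDNNSketch(num_classes=40)
        dummy = torch.randn(2, 12, 3, 224, 224)    # 2 objects x 12 rendered views
        desc, logits = model(dummy)
        print(desc.shape, logits.shape)            # torch.Size([2, 512]) torch.Size([2, 40])

In this sketch, retrieval would compare the returned descriptors (e.g., by cosine distance) while classification uses the logits; both share the saliency-weighted view sequence, mirroring the joint view selection and similarity measurement described in the abstract.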
Pages: 1496 - 1506
Number of Pages: 11