Learning Attentive and Hierarchical Representations for 3D Shape Recognition

Cited by: 16
Authors
Chen, Jiaxin [1]
Qin, Jie [1]
Shen, Yuming [3]
Liu, Li [1]
Zhu, Fan [1]
Shao, Ling [1, 2]
Affiliations
[1] Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
[2] Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
[3] eBay, Shanghai, People's Republic of China
Source
COMPUTER VISION - ECCV 2020, PT XV | 2020 / Vol. 12360
Keywords
3D shape recognition; View-agnostic/specific attentions; Multi-granularity view aggregation; Hyperbolic neural networks; Retrieval
DOI
10.1007/978-3-030-58555-6_7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a novel method for 3D shape representation learning, namely Hyperbolic Embedded Attentive Representation (HEAR). Unlike existing multi-view based methods, HEAR develops a unified framework to address both multi-view redundancy and single-view incompleteness. Specifically, HEAR first employs a hybrid attention (HA) module, which consists of a view-agnostic attention (VAA) block and a view-specific attention (VSA) block. These two blocks jointly explore distinct but complementary spatial saliency of local features for each single-view image. Subsequently, a multi-granular view pooling (MVP) module is introduced to aggregate the multi-view features at different granularities in a coarse-to-fine manner. The resulting feature set implicitly has hierarchical relations and is therefore projected into a hyperbolic space by a hyperbolic embedding. A hierarchical representation is then learned by hyperbolic multi-class logistic regression based on hyperbolic geometry. Experimental results show that HEAR outperforms the state-of-the-art approaches on three 3D shape recognition tasks: generic 3D shape retrieval, 3D shape classification, and sketch-based 3D shape retrieval.
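The coarse-to-fine pooling and the projection into hyperbolic space described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the pooling operator, the grouping scheme, or the hyperbolic model, so the element-wise max pooling, the contiguous view groups, the `levels` parameter, and the choice of the Poincaré ball are all assumptions, and the function names are hypothetical.

```python
import math

def max_pool(views):
    """Element-wise max over a list of equal-length feature vectors."""
    return [max(col) for col in zip(*views)]

def multi_granular_pool(views, levels=(1, 2, 4)):
    """Pool views in contiguous groups, from one group (coarse) to many (fine).

    Assumption: contiguous grouping and max pooling; the paper may differ.
    """
    pooled = []
    for groups in levels:
        size = math.ceil(len(views) / groups)
        for start in range(0, len(views), size):
            pooled.append(max_pool(views[start:start + size]))
    return pooled

def expmap0(v, c=1.0):
    """Exponential map at the origin of the Poincare ball with curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]

def poincare_distance(x, y):
    """Geodesic distance between two points of the unit Poincare ball (c = 1)."""
    diff = sum((a - b) ** 2 for a, b in zip(x, y))
    nx = sum(a * a for a in x)
    ny = sum(b * b for b in y)
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - nx) * (1.0 - ny)))

# Toy example: four 2-D view features (made-up numbers).
views = [[0.1, 0.4], [0.3, 0.2], [0.5, 0.1], [0.2, 0.6]]
features = multi_granular_pool(views)        # coarse-to-fine feature set
embedded = [expmap0(f) for f in features]    # points inside the unit ball
```

With four views and `levels=(1, 2, 4)`, the sketch yields one global feature, two half-set features, and four per-view features. Each embedded point lies strictly inside the unit ball, where distances grow rapidly toward the boundary, which is the property that makes hyperbolic space a natural fit for hierarchical feature sets.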
Pages: 105-122
Number of pages: 18