Learning Attentive and Hierarchical Representations for 3D Shape Recognition

Cited by: 16
Authors
Chen, Jiaxin [1]
Qin, Jie [1]
Shen, Yuming [3]
Liu, Li [1]
Zhu, Fan [1]
Shao, Ling [1, 2]
Affiliations
[1] Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
[2] Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
[3] eBay, Shanghai, People's Republic of China
Source
COMPUTER VISION - ECCV 2020, PT XV | 2020 / Vol. 12360
Keywords
3D shape recognition; View-agnostic/specific attentions; Multi-granularity view aggregation; Hyperbolic neural networks; RETRIEVAL;
DOI
10.1007/978-3-030-58555-6_7
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a novel method for 3D shape representation learning, namely Hyperbolic Embedded Attentive Representation (HEAR). Unlike existing multi-view based methods, HEAR develops a unified framework to address both multi-view redundancy and single-view incompleteness. Specifically, HEAR first employs a hybrid attention (HA) module, which consists of a view-agnostic attention (VAA) block and a view-specific attention (VSA) block. These two blocks jointly explore distinct but complementary spatial saliency of local features for each single-view image. Subsequently, a multi-granular view pooling (MVP) module aggregates the multi-view features at different granularities in a coarse-to-fine manner. The resulting feature set implicitly has hierarchical relations, so it is projected into a hyperbolic space via a hyperbolic embedding. A hierarchical representation is then learned by hyperbolic multi-class logistic regression based on hyperbolic geometry. Experimental results show that HEAR outperforms the state-of-the-art approaches on three 3D shape recognition tasks: generic 3D shape retrieval, 3D shape classification, and sketch-based 3D shape retrieval.
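The two later stages described in the abstract (coarse-to-fine view pooling followed by a hyperbolic embedding) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no details of the MVP module, so max pooling over contiguous view groups and the group sizes are assumptions, as are the function names `multi_granular_pool`, `expmap0`, and `poincare_distance`. The Poincare ball model with curvature -c is one standard choice of hyperbolic space and is likewise assumed here.

```python
import math


def multi_granular_pool(views, group_sizes=(4, 2, 1)):
    """Max-pool per-view feature vectors over contiguous groups of views.

    Assumption: group sizes run from coarse (all views) to fine (single view).
    `views` is a list of equal-length feature vectors, one per rendered view.
    """
    pooled = []
    for size in group_sizes:
        for start in range(0, len(views), size):
            group = views[start:start + size]
            # Element-wise maximum across the views in this group.
            pooled.append([max(column) for column in zip(*group)])
    return pooled


def expmap0(v, c=1.0):
    """Exponential map at the origin of the Poincare ball with curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]


def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points of the Poincare ball."""
    sq_norm = lambda a: sum(t * t for t in a)
    diff = sq_norm([a - b for a, b in zip(x, y)])
    denom = (1.0 - c * sq_norm(x)) * (1.0 - c * sq_norm(y))
    # Clamp guards against rounding that would push the argument below 1.
    arg = max(1.0, 1.0 + 2.0 * c * diff / denom)
    return math.acosh(arg) / math.sqrt(c)


# Four hypothetical 2-D view features pooled at three granularities.
views = [[0.1, 0.9], [0.4, 0.2], [0.3, 0.5], [0.8, 0.1]]
feature_set = multi_granular_pool(views)
embedded = [expmap0(f) for f in feature_set]
```

In this sketch the coarsest pooled vector comes first and the per-view vectors last, and every embedded point lies strictly inside the unit ball, where distances grow rapidly toward the boundary. That property is what makes hyperbolic space a natural fit for tree-like, hierarchical feature sets.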
Pages: 105-122
Page count: 18