Semantic and Context Information Fusion Network for View-Based 3D Model Classification and Retrieval

Cited by: 13

Authors:

Liu, An-An [1]

Guo, Fu-Bin [1]

Zhou, He-Yu [1]

Li, Wen-Hui [1]

Song, Dan [1]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

Source:

IEEE ACCESS | 2020 / Vol. 8

Funding:

National Natural Science Foundation of China;

Keywords:

Three-dimensional displays; Solid modeling; Task analysis; Semantics; Computational modeling; Feature extraction; Two dimensional displays; 3D model; semantic information; context information; CNN; CONVOLUTIONAL NEURAL-NETWORK; OBJECT RECOGNITION; FEATURES;

DOI:

10.1109/ACCESS.2020.3018875

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In recent years, with the rapid development of 3D technology, view-based methods have shown excellent performance in both 3D model classification and retrieval. A key issue in view-based methods is how to aggregate multi-view features. Existing methods commonly use one of two solutions: 1) a pooling strategy that merges multi-view features but ignores the context information contained in the continuous view sequence; 2) a grouping strategy or long short-term memory (LSTM) networks that select representative views of the 3D model but tend to neglect the semantic information of individual views. In this paper, we propose a novel Semantic and Context information Fusion Network (SCFN) to compensate for these drawbacks. First, we render views of the 3D model from multiple perspectives and extract the raw feature of each view with a 2D convolutional neural network (CNN). Then we design a channel attention mechanism (CAM) to exploit the view-wise semantic information. By modeling the correlation among view feature channels, we assign higher weights to useful feature attributes while suppressing useless ones. Next, we propose a context information fusion module (CFM) that fuses the multiple view features into a compact 3D representation. Extensive experiments on three popular datasets, i.e., ModelNet10, ModelNet40, and ShapeNetCore55, demonstrate the superiority of the proposed method over state-of-the-art methods on both 3D classification and retrieval tasks.
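
The channel-reweighting and pooling steps described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's CAM: the sigmoid gate, the `gate_weights` matrix, and the function names are assumptions chosen for the example, and element-wise max pooling stands in for the pooling baseline the abstract contrasts against, not for the CFM.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(view_features, gate_weights):
    """Reweight the channels of each view feature with a sigmoid gate.

    view_features: list of V feature vectors, each of length C.
    gate_weights:  C x C matrix of hypothetical learned parameters
                   (an assumption; the abstract does not specify the gate).
    """
    reweighted = []
    for feat in view_features:
        # Score every channel from all channels of this view, squashed to (0, 1).
        gates = [sigmoid(sum(w * f for w, f in zip(row, feat)))
                 for row in gate_weights]
        reweighted.append([g * f for g, f in zip(gates, feat)])
    return reweighted


def max_pool_views(view_features):
    """Element-wise max across views (order-agnostic pooling baseline)."""
    return [max(channel) for channel in zip(*view_features)]


# Two views with three channels each; identity weights keep the example simple.
views = [[0.2, 1.0, 0.5], [0.9, 0.1, 0.4]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

attended = channel_attention(views, identity)
descriptor = max_pool_views(attended)
```

Because max pooling is invariant to the order of the views, it discards the sequence context that the abstract says a fusion module is meant to preserve.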

Pages: 155939-155950

Page count: 12
