Selective Multi-View Deep Model for 3D Object Classification

Cited by: 8
Authors
Alzahrani, Mona [1 ,2 ]
Usman, Muhammad [1 ,3 ,4 ]
Anwar, Saeed [1 ,3 ]
Helmy, Tarek [1 ,4 ]
Affiliations
[1] KFUPM, Dept Informat & Comp Sci, Dhahran, Saudi Arabia
[2] Jouf Univ, Coll Comp & Informat Sci, Sakaka, Saudi Arabia
[3] KFUPM, SDAIA KFUPM Joint Res Ctr Artificial Intelligence, Dhahran, Saudi Arabia
[4] KFUPM, Ctr Intelligent Secure Syst, Dhahran, Saudi Arabia
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW | 2024
Keywords
DOI
10.1109/CVPRW63382.2024.00077
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D object classification has emerged as a practical technology with applications in various domains, such as medical image analysis, automated driving, intelligent robots, and crowd surveillance. Among the different approaches, multi-view representations for 3D object classification have shown the most promising results, achieving state-of-the-art performance. However, there are certain limitations in current view-based 3D object classification methods. One observation is that using all captured views for classification can confuse the classifier and lead to misleading results for certain classes. Additionally, some views may contain more discriminative information for object classification than others. These observations motivate the development of smarter and more efficient selective multi-view classification models. In this work, we propose a Selective Multi-View Deep Model that extracts multi-view images from 3D data representations and selects the most influential view by assigning importance scores using the cosine similarity method based on visual features detected by a pre-trained CNN. The proposed method is evaluated on the ModelNet40 dataset for the task of 3D classification. The results demonstrate that the proposed model achieves an overall accuracy of 88.13% using only a single view when employing a shading technique for rendering the views, pre-trained ResNet152 as the backbone CNN for feature extraction, and a Fully Connected Network (FCN) as the classifier.
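The abstract's view-selection idea (scoring views by cosine similarity of pre-trained CNN features and keeping the most influential one) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: it assumes each view has already been encoded into a feature vector (e.g. by a ResNet152 backbone) and uses mean pairwise cosine similarity as the importance score, which may differ from the paper's exact scoring.

```python
import numpy as np

def select_best_view(view_features):
    """Score each view by its mean cosine similarity to the other views'
    feature vectors and return (index of best view, all scores).
    Hypothetical sketch; the paper's exact scoring details may differ."""
    feats = np.asarray(view_features, dtype=float)
    # L2-normalise each view's feature vector so dot products are cosines
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T                # pairwise cosine-similarity matrix
    np.fill_diagonal(sim, 0.0)         # ignore each view's self-similarity
    scores = sim.sum(axis=1) / (len(feats) - 1)
    return int(np.argmax(scores)), scores

# Toy demo: 4 "views" with random 2048-d features (ResNet152-sized)
rng = np.random.default_rng(0)
demo_feats = rng.normal(size=(4, 2048))
best, scores = select_best_view(demo_feats)
```

The selected view's feature vector would then be passed to the classifier (an FCN in the paper's best configuration) instead of aggregating all views.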
Pages: 728-736
Page count: 9