3D Semantic Novelty Detection via Large-Scale Pre-Trained Models

Cited by: 0
Authors
Rabino, Paolo [1 ,2 ]
Alliegro, Antonio [1 ]
Tommasi, Tatiana [1 ]
Affiliations
[1] Polytech Univ Turin, Dept Control & Comp Engn, I-10129 Turin, Italy
[2] Italian Inst Technol, I-16163 Genoa, Italy
Source
IEEE ACCESS, 2024, Vol. 12
Keywords
Three-dimensional displays; Semantics; Feature extraction; Solid modeling; Point cloud compression; Anomaly detection; Data models; 3D point clouds; semantic novelty detection; out-of-distribution detection; training-free
DOI
10.1109/ACCESS.2024.3464334
CLC number
TP [Automation and computer technology]
Discipline code
0812
Abstract
Shifting deep learning models from lab environments to real-world settings entails preparing them to handle unforeseen conditions, including the chance of encountering novel objects from classes that were not included in their training data. Such occurrences can pose serious threats in various applications. The task of semantic novelty detection has attracted significant attention in recent years, mainly on 2D images, overlooking the complex 3D nature of the real world. In this study, we address this gap by examining the geometric structures of objects within 3D point clouds to detect semantic novelty effectively. We advance the field by introducing 3D-SeND, a method that harnesses a large-scale pre-trained model to extract patch-based object representations directly from its intermediate feature representations. These patches are used to characterize each known class precisely. At inference, a normalcy score is obtained by assessing whether a test sample can be reconstructed predominantly from patches of a single known class or from multiple classes. We evaluate 3D-SeND on real-world point cloud samples when the reference known data are synthetic and demonstrate that it excels in both standard and few-shot scenarios. Thanks to its patch-based object representation, 3D-SeND's predictions can be visualized, providing a valuable explanation of the decision process. Moreover, the inherent training-free nature of 3D-SeND allows for its immediate application to a wide array of real-world tasks, offering a compelling advantage over approaches that require a task-specific learning phase. Our code is available at https://paolotron.github.io/3DSend.github.io.
Pages: 135352-135361
Page count: 10
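The abstract describes the scoring principle only at a high level: each test object is decomposed into patch features, and the object is judged normal when those patches can be reconstructed predominantly from the patch bank of a single known class. The minimal Python sketch below illustrates one plausible way such a patch-based normalcy score could be computed; the cosine-similarity matching, the dominant-class fraction, and all function names are illustrative assumptions, not the authors' implementation (the official code is at the link above).

```python
import numpy as np


def normalcy_score(test_patches, class_banks):
    """Hypothetical patch-based normalcy score (illustrative sketch only).

    test_patches : (P, D) array of patch features for one test object.
    class_banks  : dict mapping class name -> (N_c, D) array of patch
                   features collected from known-class reference objects.

    Each test patch is matched to its most similar reference patch across
    all known classes (cosine similarity). The score is the fraction of
    test patches whose best match belongs to the single most frequent
    class: close to 1 when one known class explains the whole object,
    lower when the reconstruction is spread over multiple classes.
    """
    def l2norm(x):
        # Normalize rows so dot products become cosine similarities.
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

    test = l2norm(np.asarray(test_patches, dtype=np.float32))

    best_sim = np.full(len(test), -np.inf)
    best_cls = np.empty(len(test), dtype=object)
    for cls, bank in class_banks.items():
        bank = l2norm(np.asarray(bank, dtype=np.float32))
        sim = test @ bank.T              # (P, N_c) patch-to-patch similarities
        per_patch = sim.max(axis=1)      # best match within this class
        better = per_patch > best_sim
        best_sim[better] = per_patch[better]
        best_cls[better] = cls

    # Fraction of patches reconstructed from the dominant known class.
    _, counts = np.unique(best_cls, return_counts=True)
    return counts.max() / len(test)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    banks = {"chair": rng.normal(size=(200, 64)),
             "table": rng.normal(size=(200, 64))}
    # A sample drawn near the "chair" bank should score higher than noise.
    known = banks["chair"][:32] + 0.05 * rng.normal(size=(32, 64))
    novel = rng.normal(size=(32, 64))
    print("known-like:", round(normalcy_score(known, banks), 3))
    print("novel-like:", round(normalcy_score(novel, banks), 3))
```

In this toy usage, patches resembling a single known class yield a score near 1, while patches that scatter across several class banks yield a lower score, matching the intuition of the normalcy score described in the abstract.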