NeurNCD: Novel Class Discovery via Implicit Neural Representation

Cited by: 0
Authors
Wang, Junming [1]
Shi, Yi [2]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Beijing Jiaotong Univ, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024 | 2024
Keywords
Neural Radiation Field; Visual Embedding Space; Novel Class Discovery; Feature Fusion; Novel View Synthesis;
DOI
10.1145/3652583.3658073
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discovering novel classes in open-world settings is crucial for real-world applications. Traditional explicit representations, such as object descriptors or 3D segmentation maps, are constrained by their discrete, hole-prone, and noisy nature, which hinders accurate novel class discovery. To address these challenges, we introduce NeurNCD, the first versatile and data-efficient framework for novel class discovery that employs the meticulously designed Embedding-NeRF model combined with KL divergence as a substitute for traditional explicit 3D segmentation maps to aggregate semantic embedding and entropy in visual embedding space. NeurNCD also integrates several key components, including feature query, feature modulation and clustering, facilitating efficient feature augmentation and information exchange between the pre-trained semantic segmentation network and implicit neural representations. As a result, our framework achieves superior segmentation performance in both open and closed-world settings without relying on densely labelled datasets for supervised training or human interaction to generate sparse label supervision. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art approaches on the NYUv2 and Replica datasets.
Pages: 257-265
Page count: 9
Related papers
43 in total
  • [1] Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
    Barron, Jonathan T.
    Mildenhall, Ben
    Verbin, Dor
    Srinivasan, Pratul P.
    Hedman, Peter
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022 : 5460 - 5469
  • [2] SCIM: Simultaneous Clustering, Inference, and Mapping for Open-World Semantic Scene Understanding
    Blum, Hermann
    Mueller, Marcus G.
    Gawel, Abel
    Siegwart, Roland
    Cadena, Cesar
    [J]. ROBOTICS RESEARCH, ISRR 2022, 2023, 27 : 119 - 135
  • [3] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
    Chen, Liang-Chieh
    Zhu, Yukun
    Papandreou, George
    Schroff, Florian
    Adam, Hartwig
    [J]. COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211 : 833 - 851
  • [4] Chen Z., 2022, arXiv
  • [5] Couprie C., 2013, Indoor semantic segmentation using depth information, P1
  • [6] Continual Adaptation of Semantic Segmentation Using Complementary 2D-3D Data Representations
    Frey, Jonas
    Blum, Hermann
    Milano, Francesco
    Siegwart, Roland
    Cadena, Cesar
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (04) : 11665 - 11672
  • [7] Fu X, 2022, Arxiv, DOI arXiv:2203.15224
  • [8] Furrer F, 2018, IEEE INT C INT ROBOT, P6835, DOI 10.1109/IROS.2018.8594391
  • [9] Neural 3D Scene Reconstruction with the Manhattan-world Assumption
    Guo, Haoyu
    Peng, Sida
    Lin, Haotong
    Wang, Qianqian
    Zhang, Guofeng
    Bao, Hujun
    Zhou, Xiaowei
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022 : 5501 - 5510
  • [10] Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation
    Gupta, Saurabh
    Arbelaez, Pablo
    Girshick, Ross
    Malik, Jitendra
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2015, 112 (02) : 133 - 149