NeurNCD: Novel Class Discovery via Implicit Neural Representation

Cited by: 0
Authors
Wang, Junming [1]
Shi, Yi [2]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Beijing Jiaotong Univ, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024 | 2024
Keywords
Neural Radiation Field; Visual Embedding Space; Novel Class Discovery; Feature Fusion; Novel View Synthesis
DOI
10.1145/3652583.3658073
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Discovering novel classes in open-world settings is crucial for real-world applications. Traditional explicit representations, such as object descriptors or 3D segmentation maps, are constrained by their discrete, hole-prone, and noisy nature, which hinders accurate novel class discovery. To address these challenges, we introduce NeurNCD, the first versatile and data-efficient framework for novel class discovery that employs the meticulously designed Embedding-NeRF model combined with KL divergence as a substitute for traditional explicit 3D segmentation maps to aggregate semantic embedding and entropy in visual embedding space. NeurNCD also integrates several key components, including feature query, feature modulation and clustering, facilitating efficient feature augmentation and information exchange between the pre-trained semantic segmentation network and implicit neural representations. As a result, our framework achieves superior segmentation performance in both open and closed-world settings without relying on densely labelled datasets for supervised training or human interaction to generate sparse label supervision. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art approaches on the NYUv2 and Replica datasets.
Pages: 257-265
Number of pages: 9