LangSplat: 3D Language Gaussian Splatting

Cited by: 7
Authors
Qin, Minghan [1]
Li, Wanhua [2]
Zhou, Jiawei [1]
Wang, Haoqian [1]
Pfister, Hanspeter [2]
Affiliations
[1] Tsinghua University, Beijing, People's Republic of China
[2] Harvard University, Cambridge, MA, USA
DOI
10.1109/CVPR52733.2024.01895
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Humans live in a 3D world and commonly use natural language to interact with a 3D scene. Modeling a 3D language field to support open-ended language queries in 3D has gained increasing attention recently. This paper introduces LangSplat, which constructs a 3D language field that enables precise and efficient open-vocabulary querying within 3D spaces. Unlike existing methods that ground CLIP language embeddings in a NeRF model, LangSplat represents the language field with a collection of 3D Gaussians, each encoding language features distilled from CLIP. By employing a tile-based splatting technique for rendering language features, we circumvent the costly rendering process inherent in NeRF. Instead of directly learning CLIP embeddings, LangSplat first trains a scene-wise language autoencoder and then learns language features in the scene-specific latent space, thereby alleviating the substantial memory demands imposed by explicit modeling. Existing methods struggle with imprecise and vague 3D language fields, which fail to discern clear boundaries between objects. We delve into this issue and propose to learn hierarchical semantics using SAM, thereby eliminating the need for extensively querying the language field across various scales and for the regularization of DINO features. Extensive experimental results show that LangSplat outperforms the previous state-of-the-art method, LERF, by a large margin. Notably, LangSplat is extremely efficient, achieving a 199× speedup compared to LERF at a resolution of 1440 × 1080. We strongly encourage readers to view our video results at https://langsplat.github.io/.
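The memory argument in the abstract (learning features in a scene-specific latent space rather than storing CLIP embeddings explicitly) can be made concrete with a rough back-of-the-envelope calculation. The sketch below is illustrative only: the Gaussian count, the 512-dimensional CLIP embedding, the 3-dimensional latent, and 32-bit storage are all assumptions chosen for the example, not figures stated in this record, and `language_feature_bytes` is a hypothetical helper.

```python
def language_feature_bytes(num_gaussians: int, feature_dim: int,
                           bytes_per_value: int = 4) -> int:
    """Bytes needed to store one feature vector of `feature_dim` values per Gaussian."""
    return num_gaussians * feature_dim * bytes_per_value


# Illustrative assumptions (not taken from the abstract):
NUM_GAUSSIANS = 1_000_000  # assumed number of 3D Gaussians in a scene
CLIP_DIM = 512             # assumed size of an explicit CLIP embedding
LATENT_DIM = 3             # assumed size of the autoencoder's latent code

# Storing the full CLIP embedding on every Gaussian.
explicit_bytes = language_feature_bytes(NUM_GAUSSIANS, CLIP_DIM)
# Storing only the compressed latent code on every Gaussian.
latent_bytes = language_feature_bytes(NUM_GAUSSIANS, LATENT_DIM)

print(f"explicit CLIP features: {explicit_bytes / 1e6:.0f} MB")
print(f"latent features:        {latent_bytes / 1e6:.0f} MB")
print(f"reduction factor:       {explicit_bytes / latent_bytes:.1f}x")
```

Under these assumptions the per-Gaussian language features shrink by a factor equal to the ratio of the two dimensions; the decoder half of the autoencoder would then map rendered latent features back to CLIP space at query time.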
Pages: 20051-20060
Page count: 10