LangSplat: 3D Language Gaussian Splatting

Cited by: 7
Authors
Qin, Minghan [1]
Li, Wanhua [2]
Zhou, Jiawei [1]
Wang, Haoqian [1]
Pfister, Hanspeter [2]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Harvard Univ, Cambridge, MA USA
DOI
10.1109/CVPR52733.2024.01895
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Humans live in a 3D world and commonly use natural language to interact with a 3D scene. Modeling a 3D language field to support open-ended language queries in 3D has gained increasing attention recently. This paper introduces LangSplat, which constructs a 3D language field that enables precise and efficient open-vocabulary querying within 3D spaces. Unlike existing methods that ground CLIP language embeddings in a NeRF model, LangSplat advances the field by utilizing a collection of 3D Gaussians, each encoding language features distilled from CLIP, to represent the language field. By employing a tile-based splatting technique for rendering language features, we circumvent the costly rendering process inherent in NeRF. Instead of directly learning CLIP embeddings, LangSplat first trains a scene-wise language autoencoder and then learns language features on the scene-specific latent space, thereby alleviating the substantial memory demands imposed by explicit modeling. Existing methods struggle with imprecise and vague 3D language fields, which fail to discern clear boundaries between objects. We delve into this issue and propose to learn hierarchical semantics using SAM, thereby eliminating the need for extensive querying of the language field across various scales and for regularization with DINO features. Extensive experimental results show that LangSplat outperforms the previous state-of-the-art method, LERF, by a large margin. Notably, LangSplat is extremely efficient, achieving a 199× speedup compared to LERF at a resolution of 1440 × 1080. We strongly recommend that readers check out our video results at https://langsplat.github.io/.
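To make the idea of open-vocabulary querying concrete, the following is a minimal sketch and not the authors' implementation. It assumes a rendered language-feature map is already available as a grid of per-pixel feature vectors and that a text query has been embedded into the same space. The names `cosine` and `query_feature_map`, the toy 3-dimensional vectors, and the plain cosine-similarity threshold of 0.8 are all illustrative assumptions; the abstract does not specify the scoring rule LangSplat uses.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def query_feature_map(feature_map, text_embedding, threshold=0.8):
    """Return a boolean mask marking pixels whose feature matches the query.

    feature_map: rows of per-pixel feature vectors (hypothetical stand-in
    for rendered language features); threshold is an arbitrary choice.
    """
    return [
        [cosine(feature, text_embedding) >= threshold for feature in row]
        for row in feature_map
    ]

# Toy 2 x 2 "rendered" feature map with 3-dimensional features (assumed data).
feature_map = [
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
]
query = [1.0, 0.0, 0.0]  # stand-in for an embedded text query
mask = query_feature_map(feature_map, query)
```

In this toy case the two top-row pixels are selected and the bottom row is not. Real CLIP features have hundreds of dimensions, which is the memory cost the scene-wise autoencoder in the abstract is meant to reduce.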
Pages: 20051-20060
Number of pages: 10