61 entries in total
[1] ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes [J]. COMPUTER VISION - ECCV 2020, PT I, 2020, 12346: 422-440
[2] TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 1080-1089
[3] SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 9296-9306
[4] nuScenes: A multimodal dataset for autonomous driving [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 11618-11628
[5] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 16443-16452
[6] UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 18063-18073
[7] ScanRefer: 3D Object Localization in RGB-D Scans Using Natural Language [J]. COMPUTER VISION - ECCV 2020, PT XX, 2020, 12365: 202-221
[8] Chen JM, 2023, Arxiv, DOI arXiv:2210.12513
[9] Chen XY, 2023, Arxiv, DOI arXiv:2203.10642
[10] Cho J, 2021, PR MACH LEARN RES, V139