58 entries in total
[1] ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes [J]. COMPUTER VISION - ECCV 2020, PT I, 2020, 12346: 422-440
[2] [Anonymous], 2000, ORGAN SOUND, DOI 10.1017/S1355771800003071
[3] ScanQA: 3D Question Answering for Spatial Scene Understanding [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 19107-19117
[4] SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 9296-9306
[5] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 16443-16452
[6] ScanRefer: 3D Object Localization in RGB-D Scans Using Natural Language [J]. COMPUTER VISION - ECCV 2020, PT XX, 2020, 12365: 202-221
[7] FocalFormer3D: Focusing on Hard Instance for 3D Object Detection [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 8360-8371
[8] Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark [J]. 2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024: 21886-21896
[9] Cheng ZY, 2024, arXiv, arXiv:2304.14614
[10] Collobert R., 2019, arXiv, DOI 10.48550/arXiv.1904.05862