65 entries in total
[1]
Hudson DA, 2018, Arxiv, DOI arXiv:1803.03067
[2]
Bhattacharyya A, 2022, LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, P4944
[5]
nuScenes: A multimodal dataset for autonomous driving. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 11618-11628
[6]
End-to-End Object Detection with Transformers. COMPUTER VISION - ECCV 2020, PT I, 2020, 12346: 213-229
[7]
Grounding Commands for Autonomous Vehicles via Layer Fusion with Region-specific Dynamic Layer Attention. 2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022: 12464-12470
[8]
Chen XP, 2018, Arxiv, DOI arXiv:1812.03426
[9]
UNITER: UNiversal Image-TExt Representation Learning. COMPUTER VISION - ECCV 2020, PT XXX, 2020, 12375: 104-120
[10]
Cheng B., 2021, NeurIPS