53 entries in total
- [1] Aggarwal G., 2021, arXiv
- [2] Ahn H, 2018, IEEE INT CONF ROBOT, P5915
- [3] Language2Pose: Natural Language Grounded Pose Forecasting [J]. 2019 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2019), 2019: 719-728
- [5] Athanasiou N., 2022, P INT C 3D VIS 3DV P
- [6] Brown TB, 2020, ADV NEUR IN, V33
- [7] Cai ZA, 2023, arXiv, DOI arXiv:2204.13686
- [8] Cai ZA, 2024, arXiv, DOI arXiv:2110.07588
- [9] Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 3557-3567
- [10] Executing your Commands via Motion Diffusion in Latent Space [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 18000-18010