60 entries in total
[1]
Al-Rfou Rami, 2019, P 33 AAAI C ART INT
[2]
ViViT: A Video Vision Transformer [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 6816-6826
[3]
3D Hand Shape and Pose from Images in the Wild [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 10835-10844
[4]
Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 2272-2281
[5]
Calli B, 2015, PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS (ICAR), P510, DOI 10.1109/ICAR.2015.7251504
[6]
Carion N, 2020, Img Proc Comp Vis Re, V12346, P213, DOI 10.1007/978-3-030-58452-8_13
[7]
DexYCB: A Benchmark for Capturing Hand Grasping of Objects [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 9040-9049
[8]
Cho K., 2014, Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
[9]
2014, DOI 10.3115/V1/D14-1179
[10]
Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 1964-1973