59 entries in total
[1]
Al-Rfou R., 2019, P 33 AAAI C ART INT
[2]
[Anonymous], 2021, INT C MACH LEARN
[3]
ViViT: A Video Vision Transformer [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 6816-6826
[4]
3D Hand Shape and Pose from Images in the Wild [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 10835-10844
[5]
Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 2272-2281
[6]
Calli B, 2015, PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS (ICAR), P510, DOI 10.1109/ICAR.2015.7251504
[7]
End-to-End Object Detection with Transformers [J]. COMPUTER VISION - ECCV 2020, PT I, 2020, 12346: 213-229
[8]
DexYCB: A Benchmark for Capturing Hand Grasping of Objects [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 9040-9049
[9]
Cho K., 2014, LEARNING PHRASE REPR, DOI 10.3115/v1/D14-1179
[10]
Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 1964-1973