31 entries in total
[1] Multimodal Engagement Prediction in Multiperson Human-Robot Interaction [J]. IEEE Access, 2022, 10: 61980-61991
[2] ViViT: A Video Vision Transformer [C]. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 6816-6826
[4] Bhatia S., 2023, 2023 6 INT C INF SYS, P1
[6] Chang A. T.-d.-P. Jen-Yen, 2019 INT C ROB AUT I
[7] Donahue J, 2015, PROC CVPR IEEE, P2625, DOI 10.1109/CVPR.2015.7298878
[8] Dosovitskiy A, 2021, arXiv, DOI arXiv:2010.11929
[9] FlowNet: Learning Optical Flow with Convolutional Networks [C]. 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 2758-2766
[10] Learning Spatiotemporal Features with 3D Convolutional Networks [C]. 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 4489-4497