64 entries in total
[1]
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 12479-12488
[3]
[Anonymous], 2017, P CVPR
[4]
[Anonymous], 2011, Association for Computational Linguistics
[5]
Banerjee S., 2005, P ACL WORKSH INTR EX, P228
[6]
Bertasius G, 2021, PR MACH LEARN RES, V139
[7]
Bi Jing, 2021, P INT C COMP VIS ICC
[8]
Brown TB, 2020, ADV NEUR IN, V33
[9]
Heilbron FC, 2015, PROC CVPR IEEE, P961, DOI 10.1109/CVPR.2015.7298698
[10]
End-to-End Object Detection with Transformers [J]. COMPUTER VISION - ECCV 2020, PT I, 2020, 12346: 213-229