67 entries in total
[1]
Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 12479-12488
[2]
[Anonymous], 2012, Association for Computational Linguistics
[3]
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1708-1718
[4]
Banerjee S., 2005, P ACL WORKSHOP INTRI, P65, DOI 10.3115/1626355.1626389
[5]
Barbu Andrei, 2012, Proceedings of the Conference on Uncertainty in Artificial Intelligence UAI, P102
[6]
Chen D.L., 2011, ACL, V1, P190
[7]
Chen SX, 2019, AAAI CONF ARTIF INTE, P8191
[8]
Chen XL, 2015, arXiv, DOI arXiv:1504.00325
[9]
Dai B, 2017, ADV NEUR IN, V30
[10]
A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching [J]. 2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013: 2634-2641