98 entries in total
[81] Vinyals O, 2015, PROC CVPR IEEE, P3156, DOI 10.1109/CVPR.2015.7298935
[82] Vinyals Oriol, 2015, Advances in neural information processing systems, V28
[83] Reconstruction Network for Video Captioning [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7622-7631
[84] M3: Multimodal Memory Modelling for Video Captioning [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7512-7520
[85] Learning Deep Structure-Preserving Image-Text Embeddings [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 5005-5013
[86] Interpretable Video Captioning via Trajectory Structured Localization [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6829-6837
[87] Aggregated Residual Transformations for Deep Neural Networks [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 5987-5995
[88] Learning Multimodal Attention LSTM Networks for Video Captioning [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 537-545
[89] MSR-VTT: A Large Video Description Dataset for Bridging Video and Language [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 5288-5296
[90] Xu K, 2015, PR MACH LEARN RES, V37, P2048