22 entries in total
[22] MSR-VTT: A Large Video Description Dataset for Bridging Video and Language [J]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 5288-5296.