47 entries in total
[21]
Dhillon I. S., 2004, P 10 ACM SIGKDD INT, P551, DOI 10.1145/1014052.1014118
[23]
Learning Spatiotemporal Features with 3D Convolutional Networks [J]. 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 4489-4497
[24]
YouTube2Text: Recognizing and Describing Arbitrary Activities Using Semantic Hierarchies and Zero-shot Recognition [J]. 2013 IEEE International Conference on Computer Vision (ICCV), 2013: 2712-2719
[25]
Hochreiter S, 1997, NEURAL COMPUT, V9, P1735, DOI [10.1162/neco.1997.9.1.1, 10.1007/978-3-642-24797-2]
[26]
Jin Q, 2016, P 24 ACM INT C MULT, P1087, DOI 10.1145/2964284.2984065
[27]
Video Description Generation using Audio and Visual Cues [J]. ICMR'16: Proceedings of the 2016 ACM International Conference on Multimedia Retrieval, 2016: 239-242
[28]
King DB, 2015, ACS SYM SER, V1214, P1
[29]
Krizhevsky A., 2010, P MACH LEARN RES, P621
[30]
Lee S, 2016, ADV NEUR IN, V29