37 entries in total
[1]
[Anonymous], 2016, P 24 ACM INT C MULTI, DOI 10.1145/2964284.2984066
[2]
[Anonymous], 1997, Neural Computation
[3]
Ba LJ, 2014, ADV NEUR IN, V27
[4]
Bahdanau D, 2016, Arxiv, DOI [arXiv:1409.0473, DOI 10.48550/ARXIV.1409.0473]
[6]
A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching [J]. 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013: 2634-2641
[7]
Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences [J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1980, 28(4): 357-366
[8]
Denkowski M., 2014, P 9 WORKSH STAT MACH
[9]
Learning Spatiotemporal Features with 3D Convolutional Networks [J]. 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 4489-4497
[10]
Semantic Compositional Networks for Visual Captioning [J]. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017: 1141-1150