共 32 条
[1]
Alamri H., 2019, DSTC7 WORKSH AAAI
[2]
[Anonymous], 2011, P 49 ANN M ASS COMPU
[3]
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
[J].
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017),
2017,
:4724-4733
[4]
Denkowski M., 2014, P 9 WORKSH STAT MACH, P376
[5]
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
[J].
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2018,
:6546-6555
[6]
Hershey S, 2017, INT CONF ACOUST SPEE, P131, DOI 10.1109/ICASSP.2017.7952132
[7]
Hori C, 2019, INT CONF ACOUST SPEE, P2352, DOI [10.1109/icassp.2019.8682583, 10.1109/ICASSP.2019.8682583]
[8]
Attention-Based Multimodal Fusion for Video Description
[J].
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV),
2017,
:4203-4212
[9]
Le H, 2019, 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), P5612
[10]
King DB, 2015, ACS SYM SER, V1214, P1, DOI 10.1021/bk-2015-1214.ch001