39 entries in total
[11]
Ghosh S, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P1984
[12]
Localizing Moments in Video with Natural Language [C]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 5804-5813
[13]
Hendricks Lisa Anne, 2018, EMNLP
[14]
Keeler James, 1991, ADV NEURAL INFORM PR, V3
[15]
Dense-Captioning Events in Videos [C]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 706-715
[16]
Lei Jie, 2020, P EUR C COMP VIS
[17]
BSN: Boundary Sensitive Network for Temporal Action Proposal Generation [C]. COMPUTER VISION - ECCV 2018, PT IV, 2018, 11208: 3-21
[18]
Lin ZJ, 2020, AAAI CONF ARTIF INTE, V34, P11539
[19]
Cross-modal Moment Localization in Videos [C]. PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018: 843-851
[20]
Minuk Ma, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12373), P156, DOI 10.1007/978-3-030-58604-1_10