25 entries in total
[11]
Nagrani A., 2021, ADV NEURAL INF PROCE
[12]
End-to-End Video Captioning [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 1474-1482
[13]
Translating Video Content to Natural Language Descriptions [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013: 433-440
[14]
Ryu H, 2021, AAAI CONF ARTIF INTE, V35, P2514
[15]
Tsotsos J.K, 2021, A computational perspective on visual attention
[16]
ANALYZING VISION AT THE COMPLEXITY LEVEL [J]. BEHAVIORAL AND BRAIN SCIENCES, 1990, 13 (03): 423-444
[18]
Vaswani A, 2017, ADV NEUR IN, V30
[19]
Sequence to Sequence - Video to Text [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 4534-4542
[20]
How Blind People Interact with Visual Content on Social Networking Services [J]. ACM CONFERENCE ON COMPUTER-SUPPORTED COOPERATIVE WORK AND SOCIAL COMPUTING (CSCW 2016), 2016: 1584-1595