66 entries in total
[11] NODIS: Neural Ordinary Differential Scene Understanding [J]. COMPUTER VISION - ECCV 2020, PT XX, 2020, 12365: 636-653
[12] Detecting Visual Relationships with Deep Relational Networks [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 3298-3308
[13] Devlin J., 2018, arXiv:1810.04805
[14] Donahue J, 2015, PROC CVPR IEEE, P2625, DOI 10.1109/CVPR.2015.7298878
[15] Dosovitskiy Alexey, 2021, 9 INT C LEARN REPR I
[16] Learning Spatiotemporal Features with 3D Convolutional Networks [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 4489-4497
[17] Image Captioning with Scene-graph Based Semantic Concepts [J]. PROCEEDINGS OF 2018 10TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (ICMLC 2018), 2018: 225-229
[18] Video Action Transformer Network [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 244-253
[19] Detecting and Recognizing Human-Object Interactions [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 8359-8367
[20] He K., 2017, PROC IEEE INT C COMP, P2961