共 44 条
[1]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
[J].
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2018,
:6077-6086
[2]
Neural Module Networks
[J].
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2016,
:39-48
[3]
Chen JY, 2021, PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, P4366
[4]
Zero-Shot Visual Question Answering Using Knowledge Graph
[J].
SEMANTIC WEB - ISWC 2021,
2021, 12922
:146-162
[5]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[6]
Gao H., 2015, ADV NEURAL INFORM PR, V28, P2296, DOI DOI 10.48550/ARXIV.1505.05612
[7]
Gardères F, 2020, FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, P489
[8]
Geng YX, 2021, Arxiv, DOI arXiv:2106.15047
[9]
Bi-directional Heterogeneous Graph Hashing towards Efficient Outfit Recommendation
[J].
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022,
2022,
[10]
Focal and Composed Vision-semantic Modeling for Visual Question Answering
[J].
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021,
2021,
:4528-4536