共 70 条
- [1] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6077 - 6086
- [2] SPICE: Semantic Propositional Image Caption Evaluation [J]. COMPUTER VISION - ECCV 2016, PT V, 2016, 9909 : 382 - 398
- [3] [Anonymous], 2015, P INT C LEARN REPR S
- [4] Banerjee Satanjeev, 2005, ACL WORKSHOPS, P65
- [5] Bansal N., 2018, P ADV NEUR INF PROC, P4266
- [6] Bhojanapalli S, 2021, arXiv
- [7] Chaorui Deng, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12358), P712, DOI 10.1007/978-3-030-58601-0_42
- [8] SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6298 - 6306
- [9] Chen XL, 2015, Arxiv, DOI [arXiv:1504.00325, DOI 10.48550/ARXIV.1504.00325]
- [10] Chen Z., 2024, arXiv, DOI arXiv:2404.16821