41 entries in total
[2]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[3]
Banerjee Satanjeev, 2005, P ACL WORKSHOP INTRI, P65
[4]
Beltagy I, 2020, Arxiv, DOI arXiv:2004.05150
[6]
Chen C.-F., 2021, P INT C LEARN REPR, P1
[7]
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 347-356
[8]
Child R, 2019, Arxiv, DOI arXiv:1904.10509
[9]
Chouaf S., 2021, PROC IEEE INT GEOSCI, P2891
[10]
Chu XX, 2021, ADV NEUR IN