63 entries in total
[1] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[2] VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[3] Bi-Modal Transformer-Based Approach for Visual Question Answering in Remote Sensing Imagery [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
[4] Ben-Younes H, 2019, AAAI CONF ARTIF INTE, P8102
[5] MUTAN: Multimodal Tucker Fusion for Visual Question Answering [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 2631-2639
[6] End-to-End Object Detection with Transformers [J]. COMPUTER VISION - ECCV 2020, PT I, 2020, 12346: 213-229
[7] Chappuis C., 2022, P IEEE CVF C COMP VI, P1372
[9] Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2016, 54 (12): 7405-7415