58 entries in total
[1]
Aggarwal S, 2020, IEEE WINT CONF APPL, P2606, DOI 10.1109/WACV45572.2020.9093640
[2]
VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[3]
Cao YT, 2020, Img Proc Comp Vis Re, V12359, P230, DOI 10.1007/978-3-030-58568-6_14
[4]
RCAA: Relational Context-Aware Agents for Person Search [J]. COMPUTER VISION - ECCV 2018, PT IX, 2018, 11213: 86-102
[5]
Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association [J]. COMPUTER VISION - ECCV 2018, PT XVI, 2018, 11220: 56-73
[6]
Chen T, 2020, PR MACH LEARN RES, V119
[7]
Chen YC, 2019, AEBMR ADV ECON, V106, P104, DOI 10.1007/978-3-030-58577-8_7
[8]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[9]
Ding ZF, 2021, arXiv:2107.12666, DOI 10.48550/arXiv.2107.12666
[10]
Dosovitskiy A., 2021, P INT C LEARN REPR, P1