42 entries in total
[1]
Bao H., 2022, Advances in Neural Information Processing Systems, V35, P32897, DOI 10.1109/CVPR.2018.00636
[2]
Bugliarello Emanuele, 2022, PR MACH LEARN RES, V162, P2370
[3]
Carlsson F, 2022, LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, P6848
[4]
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 3557-3567
[5]
Chen Xinlei, 2015, CORR
[6]
UNITER: UNiversal Image-TExt Representation Learning [J]. COMPUTER VISION - ECCV 2020, PT XXX, 2020, 12375: 104-120
[7]
Chi ZW, 2021, 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), P3576
[8]
Conneau A, 2019, ADV NEUR IN, V32
[9]
Conneau Alexis, 2020, P 58 ANN M ASS COMPU, P8440, DOI 10.18653/V1/2020.ACL-MAIN.747
[10]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171