77 entries in total
[31]
Li Q, 2021, Arxiv, DOI arXiv:2008.00364
[33]
Lin W., 2021, arXiv
[34]
VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs [J]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021, 2021: 7001-7011
[35]
Litman D, 2016, AAAI CONF ARTIF INTE, P4170
[36]
Liu JC, 2021, Arxiv, DOI arXiv:2101.06804
[38]
OAG-BERT: Towards A Unified Backbone Language Model For Academic Knowledge Services [J]. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2022, 2022: 3418-3428
[39]
Ziegler DM, 2020, Arxiv, DOI [arXiv:1909.08593, 10.48550/arXiv.1909.08593]
[40]
Madotto A, 2020, Arxiv, DOI arXiv:2008.06239