179 entries in total
[61]
Kitaev N., 2020, Proc. the 8th International Conference on Learning Representations
[62]
Korthikanti VA., 2023, PROC 6 C MACHINE LEA
[63]
Kusumoto M., 2019, Proc. the 33rd International Conference on Neural Information Processing Systems
[64]
Efficient Memory Management for Large Language Model Serving with PagedAttention. Proceedings of the Twenty-Ninth ACM Symposium on Operating Systems Principles (SOSP 2023), 2023: 611-626
[66]
Lepikhin D., 2020, arXiv:2006.16668
[67]
Lewis Mike, 2020, P 58 ANN M ASS COMP, P7871, DOI 10.18653/v1/2020.acl-main.703
[69]
Li CL, 2024, AAAI CONF ARTIF INTE, P18490
[70]
Li S., PROC 2021 INT C HIGH