92 entries in total
- [11] Chiang W.-L., 2023, Vicuna: An open-source chatbot impressing gpt-4 with 90
- [12] LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference. 2021 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2021), 2021, pp. 493-506
- [13] Choquette J., 2020, 2020 IEEE HOT CHIPS, P1
- [14] Dao T., 2022, NeurIPS
- [15] Dao T., 2023, arXiv:2307.08691, DOI 10.48550/arXiv.2307.08691
- [16] DeepSpeed, 2023, Zero documentation
- [17] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
- [18] Du ZX, 2022, arXiv:2103.10360
- [19] Gorman Mel, 2006, OTT LIN S CIT, V1, P369
- [20] Guan Y, 2022, PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), P7275