47 entries in total
- [12] Ebird: Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services [J]. 2019 IEEE 37TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2019), 2019: 497-505
- [14] Christopher Olston, Kiril Gorovoy, 2016, TENSORFLOW SERVING
- [15] Eunyoung Jeong, 2014, Proceedings of NSDI '14: 11th USENIX Symposium on Networked Systems Design and Implementation. NSDI '14, P489
- [16] Fried J, 2020, PROCEEDINGS OF THE 14TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '20), P281
- [17] Low Latency RNN Inference with Cellular Batching [J]. EUROSYS '18: PROCEEDINGS OF THE THIRTEENTH EUROSYS CONFERENCE, 2018,
- [19] Gujarati A, 2020, PROCEEDINGS OF THE 14TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '20), P443
- [20] DeepRecSys: A System for Optimizing End-To-End At-Scale Neural Recommendation Inference [J]. 2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2020), 2020: 982-995