20 entries in total
[1]
Adriaens J.T., Compton K., Kim N.S., Schulte M.J., The case for GPGPU spatial multitasking, IEEE International Symposium on High-Performance Computer Architecture, (2012)
[2]
Amert T., Otterness N., Yang M., Anderson J.H., Smith F.D., GPU scheduling on the NVIDIA TX2: Hidden details revealed, 2017 IEEE Real-Time Systems Symposium (RTSS), (2017)
[3]
Ausavarungnirun R., Landgraf J., Miller V., Ghose S., Gandhi J., Rossbach C.J., Mutlu O., Mosaic: A GPU memory manager with application-transparent support for multiple page sizes, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), (2017)
[4]
Awatramani M., Zambreno J., Rover D., Increasing GPU throughput using kernel interleaved thread block scheduling, 2013 IEEE 31st International Conference on Computer Design (ICCD), (2013)
[5]
Belviranli M.E., Khorasani F., Bhuyan L.N., Gupta R., CuMAS: Data transfer aware multi-application scheduling for shared GPUs, Proceedings of the 2016 International Conference on Supercomputing, ICS '16, (2016)
[6]
Che S., Boyer M., Meng J., Tarjan D., Sheaffer J.W., Lee S., Skadron K., Rodinia: A benchmark suite for heterogeneous computing, 2009 IEEE International Symposium on Workload Characterization (IISWC), (2009)
[7]
Jain P., Mo X., Jain A., Subbaraj H., Durrani R.S., Tumanov A., Gonzalez J., Stoica I., Dynamic Space-Time Scheduling for GPU Inference, (2018)
[8]
Jia Z., Maggioni M., Smith J., Scarpazza D.P., Dissecting the NVIDIA Turing T4 GPU via Microbenchmarking, (2019)
[9]
Jia Z., Maggioni M., Staiger B., Scarpazza D.P., Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking, (2018)
[10]
Li H., Yu D., Kumar A., Tu Y.-C., Performance modeling in CUDA streams: A means for high-throughput data processing, IEEE International Conference on Big Data, (2014)