50 items in total
- [21] Portability efficiency approach for calculating performance portability. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2025, 170
- [22] Analyzing CUDA Workloads Using a Detailed GPU Simulator. ISPASS 2009: IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, 2009: 163-174
- [23] Exploration of GPU sharing policies under GEMM workloads. PROCEEDINGS OF THE 23RD INTERNATIONAL WORKSHOP ON SOFTWARE AND COMPILERS FOR EMBEDDED SYSTEMS (SCOPES 2020), 2020: 66-69
- [24] Understanding of GPU Architectural Vulnerability for Deep Learning Workloads. 2019 IEEE INTERNATIONAL SYMPOSIUM ON DEFECT AND FAULT TOLERANCE IN VLSI AND NANOTECHNOLOGY SYSTEMS (DFT), 2019
- [25] Optimizing Deep Learning Workloads on ARM GPU with TVM. 1ST ACM REQUEST WORKSHOP/TOURNAMENT ON REPRODUCIBLE SOFTWARE/HARDWARE CO-DESIGN OF PARETO-EFFICIENT DEEP LEARNING, 2018
- [26] Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning Workloads. SC21: INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2021
- [27] GPU-Initiated Resource Allocation for Irregular Workloads. PROCEEDINGS OF 2024 3RD INTERNATIONAL WORKSHOP ON EXTREME HETEROGENEITY SOLUTIONS, EXHET 2024, 2024: 1-8
- [28] Effective Performance Portability. PROCEEDINGS OF 2018 IEEE/ACM INTERNATIONAL WORKSHOP ON PERFORMANCE, PORTABILITY AND PRODUCTIVITY IN HPC (P3HPC 2018), 2018: 24-36
- [29] Provision and use of GPU resources for distributed workloads via the Grid. 24TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS (CHEP 2019), 2020, 245
- [30] Characterizing Convolutional Neural Network Workloads on a Detailed GPU Simulator. PROCEEDINGS INTERNATIONAL SOC DESIGN CONFERENCE 2017 (ISOCC 2017), 2017: 84-85