36 entries in total
[1]
Adriaens JT, 2012, INT S HIGH PERF COMP, P79
[2]
Baghsorkhi S. S., 2009, WORKSH EPHAM2009 CON, P1
[4]
An Adaptive Performance Modeling Tool for GPU Architectures. PPOPP 2010: PROCEEDINGS OF THE 2010 ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING, 2010: 105-114
[5]
Bakhoda A, 2009, INT SYM PERFORM ANAL, P163, DOI 10.1109/ISPASS.2009.4919648
[6]
Beyls K., 2001, Proceedings of the IASTED Conference on Parallel and Distributed Computing and systems, V14, P350
[7]
Accurately modeling the on-chip and off-chip GPU memory subsystem. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 82: 510-519
[8]
Che SA, 2009, I S WORKL CHAR PROC, P44, DOI 10.1109/IISWC.2009.5306797
[9]
Ocelot: A Dynamic Optimization Framework for Bulk-Synchronous Applications in Heterogeneous Systems. PACT 2010: PROCEEDINGS OF THE NINETEENTH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, 2010: 353-364
[10]
StatStack: Efficient Modeling of LRU caches. 2010 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2010), 2010: 55-65