44 items in total
- [1] Optimizing Dynamic Programming on Graphics Processing Units via Data Reuse and Data Prefetch with Inter-Block Barrier Synchronization. PROCEEDINGS OF THE 2012 IEEE 18TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS 2012), 2012: 45-52
- [3] Virtual Thread: Maximizing Thread-Level Parallelism beyond GPU Scheduling Limit. 2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016: 609-621
- [5] An Analytical Model for a GPU Architecture with Memory-level and Thread-level Parallelism Awareness. ISCA 2009: 36TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, 2009: 152-163
- [10] Improving ODE Integration on Graphics Processing Units by Reducing Thread Divergence. COMPUTATIONAL SCIENCE - ICCS 2019, PT III, 2019, 11538: 450-456