Classifying Memory Access Patterns for Prefetching

Cited by: 54
Authors
Ayers, Grant [1, 3]
Litz, Heiner [2, 3]
Kozyrakis, Christos [1, 3]
Ranganathan, Parthasarathy [3]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] UC Santa Cruz, Santa Cruz, CA USA
[3] Google, Mountain View, CA 94043 USA
Source
TWENTY-FIFTH INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXV) | 2020
Keywords
DOI
10.1145/3373376.3378498
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Prefetching is a well-studied technique for addressing the memory access stall time of contemporary microprocessors. However, despite a large body of related work, the memory access behavior of applications is not well understood, and it remains difficult to predict whether a particular application will benefit from a given prefetcher technique. In this work, we propose a novel methodology to classify the memory access patterns of applications, enabling well-informed reasoning about the applicability of a certain prefetcher. Our approach leverages instruction dataflow information to uncover a wide range of access patterns, including arbitrary combinations of offsets and indirection. These combinations, or prefetch kernels, represent reuse, strides, reference locality, and complex address generation. By determining the complexity and frequency of these access patterns, we enable reasoning about prefetcher timeliness and criticality, exposing the limitations of existing prefetchers. Moreover, using these kernels, we are able to compute the next address for the majority of top-missing instructions, and we propose a software prefetch injection methodology that is able to outperform state-of-the-art hardware prefetchers.
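To make the idea of a prefetch kernel more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a kernel can be modeled as a short sequence of constant offsets and dereferences applied to a base value. The encoding, the names `apply_kernel` and `classify`, the three coarse labels, and the toy `memory` dictionary are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding, chosen for this sketch only:
#   ("add", k)  -> add the constant offset k to the running value
#   ("load",)   -> dereference: replace the value with memory[value]
Op = Tuple
Kernel = List[Op]


def apply_kernel(kernel: Kernel, base: int, memory: Dict[int, int]) -> int:
    """Compute the next address by applying each kernel step to base."""
    value = base
    for op in kernel:
        if op[0] == "add":
            value += op[1]
        elif op[0] == "load":
            value = memory[value]
        else:
            raise ValueError(f"unknown op: {op!r}")
    return value


def classify(kernel: Kernel) -> str:
    """Coarse label based on which step types the kernel contains."""
    if not kernel:
        return "reuse"      # same address is accessed again
    if any(op[0] == "load" for op in kernel):
        return "indirect"   # address depends on a loaded value
    return "stride"         # address is the base plus constant offsets


# A fixed 64-byte stride, and a linked-list step: next = *(node + 8).
STRIDE: Kernel = [("add", 64)]
POINTER_CHASE: Kernel = [("add", 8), ("load",)]

# Toy memory image (assumed): each node stores its next pointer at offset 8.
memory = {0x1008: 0x2000, 0x2008: 0x3000}

print(hex(apply_kernel(STRIDE, 0x1000, memory)))         # 0x1040
print(hex(apply_kernel(POINTER_CHASE, 0x1000, memory)))  # 0x2000
print(classify([]), classify(STRIDE), classify(POINTER_CHASE))
```

In this toy model, a kernel with more steps, and in particular more dereferences, takes longer to evaluate. That loosely mirrors the abstract's point that kernel complexity bears on prefetcher timeliness.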
Pages: 513-526
Page count: 14