Exploiting procedure level locality to reduce instruction cache misses

Cited by: 0

Authors:

Batchu, RV [1]

Jiménez, DA [1]

Affiliations:

[1] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08855 USA

Source:

EIGHTH WORKSHOP ON INTERACTION BETWEEN COMPILERS AND COMPUTER ARCHITECTURES, PROCEEDINGS | 2004

DOI:

10.1109/INTERA.2004.1299512

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

High instruction fetch bandwidth is essential for high performance in today's wide-issue out-of-order processors. Instruction caches must provide a low miss rate as well as low latency. We introduce Procedure Level Relocation, a class of dynamic feedback-directed optimizations that substantially reduce the instruction cache miss rate by exploiting the temporal locality of procedure usage. Based on the observation that half of all procedures executed are at most 128 bytes in length, we present a Small Procedure Cache, a small and fast explicitly managed memory for storing small procedures. We show that Procedure Level Relocation into a Small Procedure Cache reduces the instruction cache miss rate by an average of 15%.
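To make the idea in the abstract concrete, the following is a toy sketch of why moving hot small procedures out of the instruction cache can cut conflict misses. It is not the authors' algorithm: the abstract describes a dynamic feedback-directed scheme, whereas this sketch assumes a static, frequency-based greedy selection, a direct-mapped cache, and a 32-byte line size. The names `icache_misses`, `pick_spc`, `LINE`, and `SMALL_LIMIT` are hypothetical; only the 128-byte threshold comes from the abstract.

```python
from collections import Counter

LINE = 32          # cache line size in bytes (assumption, not from the paper)
SMALL_LIMIT = 128  # "small procedure" threshold quoted in the abstract


def icache_misses(trace, layout, sizes, n_lines, spc=frozenset()):
    """Count misses of a direct-mapped I-cache over a trace of procedure calls.

    Procedures listed in `spc` are assumed to be fetched from the Small
    Procedure Cache, so they never touch the I-cache in this toy model.
    """
    tags = [None] * n_lines
    misses = 0
    for proc in trace:
        if proc in spc:
            continue
        first = layout[proc] // LINE
        last = (layout[proc] + sizes[proc] - 1) // LINE
        for block in range(first, last + 1):
            idx = block % n_lines
            if tags[idx] != block:
                tags[idx] = block  # fill the line on a miss
                misses += 1
    return misses


def pick_spc(trace, sizes, capacity):
    """Greedily fill the SPC with the most frequently called small procedures."""
    chosen, used = set(), 0
    for proc, _count in Counter(trace).most_common():
        if sizes[proc] <= SMALL_LIMIT and used + sizes[proc] <= capacity:
            chosen.add(proc)
            used += sizes[proc]
    return frozenset(chosen)


# Two 64-byte procedures that map to the same lines of a 256-byte cache.
sizes = {"a": 64, "b": 64}
layout = {"a": 0, "b": 256}
trace = ["a", "b"] * 4 + ["a"]

spc = pick_spc(trace, sizes, capacity=64)
baseline = icache_misses(trace, layout, sizes, n_lines=8)
relocated = icache_misses(trace, layout, sizes, n_lines=8, spc=spc)
```

In this contrived trace the two procedures evict each other on every call, so relocating just one of them removes the conflict entirely. Real workloads would show a much smaller effect, such as the 15% average reduction reported in the abstract.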

Pages: 75-84

Page count: 10
