Focused Value Prediction

Cited by: 15

Authors:

Bandishte, Sumeet [1]

Gaur, Jayesh [1]

Sperber, Zeev [2]

Rappoport, Lihu [2]

Yoaz, Adi [2]

Subramoney, Sreenivas [1]

Affiliations:

[1] Intel Labs, Processor Architecture Research Lab, Bengaluru, India

[2] Intel Corp, Haifa, Israel

Source:

2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2020) | 2020

DOI:

10.1109/ISCA45697.2020.00018

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Value Prediction was proposed to speculatively break true data dependencies, thereby allowing Out of Order (OOO) processors to achieve higher instruction level parallelism (ILP) and gain performance. State-of-the-art value predictors try to maximize the number of instructions that can be value predicted, with the belief that a higher coverage will unlock more ILP and increase performance. Unfortunately, this comes at increased complexity with implementations that require multiple different types of value predictors working in tandem, incurring substantial area and power cost. In this paper we motivate towards lower coverage, but focused, value prediction. Instead of aggressively increasing the coverage of value prediction, at the cost of higher area and power, we motivate refocusing value prediction as a mechanism to achieve an early execution of instructions that frequently create performance bottlenecks in the OOO processor. Since we do not aim for high coverage, our implementation is light-weight, needing just 1.2 KB of storage. Simulation results on 60 diverse workloads show that we deliver 3.3% performance gain over a baseline similar to the Intel Skylake processor. This performance gain increases substantially to 8.6% when we simulate a futuristic up-scaled version of Skylake. In contrast, for the same storage, state-of-the-art value predictors deliver a much lower speedup of 1.7% and 4.7% respectively. Notably, our proposal is similar to these predictors in performance, even when they are given nearly eight times the storage and have 60% more prediction coverage than our solution.


Pages: 79-91

Page count: 13
