TURBULENCE: Complexity-Effective Out-of-Order Execution on GPU With Distance-Based ISA

Cited by: 0

Authors:

Matsuo, Reoma [1]

Koizumi, Toru [1]

Irie, Hidetsugu [1]

Sakai, Shuichi [1]

Shioya, Ryota [1]

Affiliations:

[1] Univ Tokyo, Tokyo 1138654, Japan

Source:

IEEE COMPUTER ARCHITECTURE LETTERS | 2024 / Volume 23 / Issue 02

Keywords:

Registers; Out of order; Graphics processing units; Relays; Microarchitecture; Dynamic scheduling; Decoding; Energy efficiency; GPU; instruction-level parallelism; microarchitecture; out-of-order execution;

DOI:

10.1109/LCA.2023.3289317

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

A graphics processing unit (GPU) is a processor that achieves high throughput by exploiting data parallelism. We found that many GPU workloads also contain instruction-level parallelism that can be extracted through out-of-order execution to provide additional performance improvement opportunities. We propose the TURBULENCE architecture for very low-cost out-of-order execution on GPUs. TURBULENCE consists of a novel ISA that introduces the concept of referencing operands by inter-instruction distance instead of register numbers, and a novel microarchitecture that executes this ISA. This distance-based operand has the property of not causing false dependencies. By exploiting this property, we achieve cost-effective out-of-order execution on GPUs without introducing expensive hardware such as rename logic and a load-store queue. Simulation results show that TURBULENCE improves performance by 17.6% without increasing energy consumption over an existing GPU.
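The idea of naming operands by inter-instruction distance can be illustrated with a small toy model. The sketch below is not the TURBULENCE ISA or its microarchitecture, which this record does not describe in detail. The `DistInstr` type, the `src_distances` field, and the helper functions are all hypothetical names chosen for illustration, and the convention that distance 1 means "the immediately preceding instruction" is an assumption. The point it shows is that when each operand names its producer directly, the only dependencies are true (read-after-write) ones, so there is no reused register name that could create a false dependency.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class DistInstr:
    """Hypothetical toy instruction: operands name producers by backward distance."""
    op: Callable[..., int]
    src_distances: Tuple[int, ...] = ()  # assumed: 1 = immediately preceding instruction


def producers(program: List[DistInstr], i: int) -> List[int]:
    """Indices of the instructions whose results instruction i consumes."""
    idxs = [i - d for d in program[i].src_distances]
    assert all(0 <= p < i for p in idxs), "distance must point at an earlier instruction"
    return idxs


def ready_levels(program: List[DistInstr]) -> List[int]:
    """Earliest dataflow level of each instruction, using true dependencies only."""
    levels: List[int] = []
    for i in range(len(program)):
        deps = producers(program, i)
        levels.append(1 + max((levels[p] for p in deps), default=-1))
    return levels


def execute(program: List[DistInstr]) -> List[int]:
    """Evaluate in program order; each instruction produces exactly one result."""
    results: List[int] = []
    for i, ins in enumerate(program):
        results.append(ins.op(*(results[p] for p in producers(program, i))))
    return results


program = [
    DistInstr(lambda: 3),                   # 0: constant 3
    DistInstr(lambda: 4),                   # 1: constant 4
    DistInstr(lambda a, b: a + b, (2, 1)),  # 2: result[0] + result[1]
    DistInstr(lambda: 10),                  # 3: constant 10, independent of 0..2
    DistInstr(lambda a, b: a * b, (1, 2)),  # 4: result[3] * result[2]
]

print(execute(program))
print(ready_levels(program))
```

In this toy model, instructions 0, 1, and 3 share dataflow level 0 and could issue together, because no instruction ever overwrites a named location that another instruction still needs. A register-based encoding of the same program that reused a register for instruction 3 would introduce a write-after-read or write-after-write hazard, which is what rename logic normally removes.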


Pages: 175-178

Number of pages: 4

