Unrolling shape for out-of-order processors.

Author
Sato, H [1]
Affiliation
[1] Univ Tokyo, Ctr Informat Technol, Tokyo, Japan
Source
INNOVATIVE ARCHITECTURE FOR FUTURE GENERATION HIGH-PERFORMANCE PROCESSORS AND SYSTEMS | 2003
DOI
10.1109/IWIA.2003.1262786
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Loop unrolling is today one of the most effective optimizations for modern architectures. Unrolling shape was proposed to give an analytical model of loop unrolling performance. It was applied to in-order processors, and was proved to give an accurate performance model for loop unrolling in terms of software pipelining and cache miss alleviation. In this paper, we apply unrolling shape to out-of-order processors. A scheme for calculating $PL_{OOO}$, the pipelining term of a loop unrolled by a factor $l$, is presented as $PL_{OOO}(l) = \{N_{ins}(l)/F + N_{Occpy}(l)\}/l$, where $N_{ins}(l)$ is the number of instructions in the loop unrolled by factor $l$, $F$ is the fetch rate of the architecture, and $N_{Occpy}(l)$ is the number of store instructions scheduled after the $N_{ins}(l)/F$-th cycle. The pipelining term for in-order processors is essential for calculating $N_{Occpy}(l)$. It should be noted that the scheme for out-of-order processors uses unrolling shape for in-order processors. Experiments show that our scheme precisely calculates the behaviour of loop unrolling on out-of-order processors. We show that our scheme quantitatively expresses the effect of loop unrolling as that of infinitely unrolled loops on in-order processors. Furthermore, we reveal that the old folklore that loop unrolling reduces loop overhead has revived on out-of-order processors as a performance improvement factor, $\frac{d}{dl}PL_{OOO}(1)$.
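The pipelining term in the abstract can be evaluated directly once its three inputs are known. The following is a minimal Python sketch of that arithmetic only, not the paper's scheme: the function name `pl_ooo` and its parameter names are hypothetical, the value of NOccpy(l) is taken as a given input (the abstract says it is derived from the in-order pipelining term, which is not reproduced here), and the sample numbers are made up purely for illustration.

```python
def pl_ooo(n_ins: int, fetch_rate: float, n_occpy: int, unroll_factor: int) -> float:
    """Evaluate PL_OOO(l) = (N_ins(l) / F + N_Occpy(l)) / l.

    n_ins         -- N_ins(l): instructions in the loop unrolled by factor l
    fetch_rate    -- F: instructions fetched per cycle
    n_occpy       -- N_Occpy(l): stores scheduled after the N_ins(l)/F-th cycle
                     (assumed to be supplied by the caller)
    unroll_factor -- l: the unrolling factor
    """
    if unroll_factor < 1:
        raise ValueError("unroll_factor must be at least 1")
    if fetch_rate <= 0:
        raise ValueError("fetch_rate must be positive")
    return (n_ins / fetch_rate + n_occpy) / unroll_factor


# Illustrative inputs only (assumptions, not measurements from the paper):
# each tuple is (l, N_ins(l), N_Occpy(l)) for a fetch rate of F = 4.
samples = [(1, 6, 1), (2, 10, 1), (4, 18, 1)]
for l, n_ins, n_occpy in samples:
    print(l, pl_ooo(n_ins, fetch_rate=4, n_occpy=n_occpy, unroll_factor=l))
```

With these made-up inputs the per-iteration term shrinks as l grows, because the fixed part of the instruction count is spread over more original iterations. Real values of NOccpy(l) would vary with l and with the schedule.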
Pages: 88-97
Page count: 10