Dynamic Instruction Scheduling in a Trace-based Multi-threaded Architecture

Authors
Peter A. Rounce
Alberto F. De Souza
Affiliations
[1] University College London,Department of Computer Science
[2] Universidade Federal do Espírito Santo,Departamento de Informática
Source
International Journal of Parallel Programming | 2008 / Vol. 36
Keywords
Simultaneous multi-threading; Dynamic instruction scheduling; Wide issue architectures; VLIW
Abstract
Simulation results are presented in which the hardware-implemented, trace-based dynamic instruction scheduler of our single-process DTSVLIW architecture schedules instructions from several processes into multiple streams of VLIW instructions for execution by a wide-issue, simultaneous multi-threading (SMT) execution engine. The scheduling process involves single-instruction execution of each process, with the executed instructions dynamically scheduled into blocks of VLIW instructions that are cached for subsequent SMT execution. SMT provides a mechanism to reduce the impact of the horizontal and vertical waste, and of the variable memory latencies, seen in the DTSVLIW. Preliminary experiments explore this extended model. The results show PE utilization of up to 87% on a 4-thread, 1-scalar, 8-PE design, with speed-ups of up to 6.3 times over a single processor. Notably, only a single scalar process needs to be scheduled at any time, and main memory fetches are 1–4% of those of a single processor.
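The waste and utilization terms used in the abstract can be made concrete with a small sketch. It assumes the usual SMT definitions (horizontal waste: idle issue slots in a cycle that issues something; vertical waste: cycles that issue nothing) and a toy fixed-priority merge of per-thread VLIW instruction widths. The function names, the merge policy, and the sample widths are hypothetical illustrations and are not the DTSVLIW scheduler or data from the paper.

```python
from typing import Dict, List


def slot_waste(busy_per_cycle: List[int], num_pes: int) -> Dict[str, float]:
    """Split issue slots into used, horizontally wasted, and vertically wasted.

    busy_per_cycle[i] is the number of PEs doing useful work in cycle i.
    """
    total = len(busy_per_cycle) * num_pes
    used = sum(busy_per_cycle)
    # Vertical waste: every slot of a cycle in which nothing issued.
    vertical = sum(num_pes for busy in busy_per_cycle if busy == 0)
    # Horizontal waste: idle slots in cycles that did issue something.
    horizontal = sum(num_pes - busy for busy in busy_per_cycle if busy > 0)
    return {
        "utilization": used / total,
        "horizontal_waste": horizontal / total,
        "vertical_waste": vertical / total,
    }


def merge_threads(threads: List[List[int]], num_pes: int) -> List[int]:
    """Toy SMT merge (assumed policy, not the paper's).

    Each thread is a list of VLIW instruction widths; a width of 0 models
    a stall cycle. Every cycle, threads are visited in fixed priority order
    and a thread issues its next instruction if it fits in the free slots.
    """
    if any(width > num_pes for thread in threads for width in thread):
        raise ValueError("instruction wider than the machine")
    pending = [list(thread) for thread in threads]
    merged = []
    while any(pending):
        free = num_pes
        for queue in pending:
            if queue and queue[0] <= free:
                free -= queue.pop(0)
        merged.append(num_pes - free)
    return merged


# Hypothetical per-cycle widths for two threads on an 8-PE engine.
thread_a = [3, 0, 2, 4]
thread_b = [4, 1, 0, 3]

single = slot_waste(thread_a, num_pes=8)
shared = slot_waste(merge_threads([thread_a, thread_b], num_pes=8), num_pes=8)
print(single)
print(shared)
```

In this toy case the stall cycle of one thread is filled by the other, so the vertical waste disappears and utilization rises; the figures reported in the abstract come from the authors' simulations, not from this model.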
Pages: 184-205
Number of pages: 21