A Large-Scale Speculation for the Thread-Level Parallelization

Cited by: 1
Authors
Shoji, Yuki [1]
Nunome, Atsushi
Hirata, Hiroaki
Shibayama, Kiyoshi
Affiliations
[1] Kyoto Inst Technol, Grad Sch Sci & Technol, Sakyo Ku, Kyoto 6068585, Japan
Source
3RD INTERNATIONAL CONFERENCE ON APPLIED COMPUTING AND INFORMATION TECHNOLOGY (ACIT 2015) / 2ND INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND INTELLIGENCE (CSI 2015) | 2015
Keywords
thread-level parallelization; speculative execution; shared-memory multiprocessor; parallel architecture; cache coherency; memory renaming; value prediction
DOI
10.1109/ACIT-CSI.2015.39
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We have been developing a multiprocessor architecture that creates speculative threads from a sequential program and executes them in parallel. With this architecture, we aim at large-scale speculation that supports the execution of speculative threads of arbitrary size and duration. Our system must therefore be able to analyze dependencies over large amounts of memory data. In this paper, we outline the current design of our architecture and describe in detail the mechanisms for dynamic inter-thread dependency analysis, memory renaming, and speculative data management. These mechanisms not only enable a large amount of speculative data to be maintained but also reduce speculation overheads. We also investigate how many dependencies between coarse-grain threads are found in practical application programs and estimate the potential for parallelization with our speculation architecture. Our memory renaming mechanism can remove most of the hazards due to these dependencies, and value prediction is a promising way to remove a large part of the remainder.
Pages: 162-168
Number of pages: 7