Compiler-Based Timing For Extremely Fine-Grain Preemptive Parallelism

Cited by: 6
Authors
Ghosh, Souradip [1 ]
Cuevas, Michael [1 ]
Campanoni, Simone [1 ]
Dinda, Peter [1 ]
Affiliation
[1] Northwestern Univ, Dept Comp Sci, Evanston, IL 60208 USA
Source
PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20) | 2020
Funding
National Science Foundation (USA);
Keywords
timing; preemptive scheduling; fine-granularity parallelism; IMPLEMENTATION;
DOI
10.1109/SC41405.2020.00057
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In current operating system kernels and run-time systems, timing is based on hardware timer interrupts, introducing inherent overheads that limit granularity. For example, the scheduling quantum of preemptive threads is limited, resulting in this abstraction being restricted to coarse-grain parallelism. Compiler-based timing replaces interrupts from the hardware timer with callbacks from compiler-injected code. We describe a system that achieves low-overhead timing using whole-program compiler transformations and optimizations combined with kernel and run-time support. A key novelty is new static analyses that achieve predictable, periodic run-time behavior from the transformed code, regardless of control-flow path. We transform the code of a kernel and run-time system to use compiler-based timing and leverage the resulting fine-grain timing to extend an implementation of fibers (cooperatively scheduled threads), attaining what is effectively preemptive scheduling. The result combines the fine granularity of the cooperative fiber model with the ease of programming of the preemptive thread model.
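The core idea in the abstract, callbacks from compiler-injected code standing in for timer interrupts, can be illustrated with a toy simulation. This is a minimal sketch and not the authors' implementation: the names `CompilerTimer`, `check`, `workload`, and `run_demo` are hypothetical, the cycle costs are made-up constants, and a virtual cycle counter replaces the hardware counter a real system would read. It assumes a compiler pass has placed a `check` call at most every 10 cycles along every control-flow path, so the callback fires at a regular period no matter which branch executes.

```python
"""Toy simulation of compiler-based timing (illustrative only)."""


class CompilerTimer:
    """Hypothetical timing framework that injected code calls into.

    `now` is a virtual cycle counter; a real system would read a
    hardware cycle counter instead of being told the elapsed cycles.
    """

    def __init__(self, period_cycles, callback):
        self.period = period_cycles
        self.callback = callback
        self.now = 0
        self.last_fire = 0

    def check(self, elapsed_cycles):
        """The call a compiler pass would inject into the program."""
        self.now += elapsed_cycles
        # Fire the callback once a full period has passed since the last one.
        if self.now - self.last_fire >= self.period:
            self.last_fire = self.now
            self.callback(self.now)


def workload(timer, iterations):
    """Two control-flow paths with different (assumed) costs.

    The cheap path costs 10 cycles and the expensive path 30 cycles.
    Checks are placed so that at most 10 cycles separate consecutive
    checks on either path.
    """
    for i in range(iterations):
        if i % 2 == 0:
            timer.check(10)          # cheap path: one check
        else:
            for _ in range(3):       # expensive path: three checks
                timer.check(10)


def run_demo():
    """Run the workload and return the cycle counts at which the callback fired."""
    fired_at = []
    timer = CompilerTimer(period_cycles=100, callback=fired_at.append)
    workload(timer, iterations=10)
    return fired_at
```

Calling `run_demo()` returns the virtual cycle counts at which the callback ran. In a fiber runtime of the kind the abstract describes, such a callback would be the point where the scheduler could switch to another fiber.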
Pages: 15