Architectural support for task scheduling: hardware scheduling for dataflow on NUMA systems

Cited by: 0
Authors
Behram Khan
Daniel Goodman
Salman Khan
Will Toms
Paolo Faraboschi
Mikel Luján
Ian Watson
Affiliations
[1] BT Research
[2] Solarflare Communications
[3] The University of Manchester
[4] HP Labs
Source
The Journal of Supercomputing | 2015 / Volume 71
Keywords
Scheduling; Hardware scheduling; Task-based application; Dataflow
DOI
Not available
Abstract
To harness the compute resources of many-core systems with tens to hundreds of cores, applications have to expose parallelism to the hardware. Researchers are aggressively looking for program execution models that make it easier to expose parallelism and use the available resources. One common approach is to decompose a program into parallel 'tasks' and allow an underlying system layer to schedule these tasks to different threads. Software-only schedulers can implement various scheduling policies and algorithms that match the characteristics of different applications and programming models. Unfortunately, with large-scale multi-core systems, software schedulers suffer significant overheads as they synchronize and communicate task information over deep cache hierarchies. To reduce these overheads, hardware-only schedulers like Carbon have been proposed to enable task queuing and scheduling to be done in hardware. This paper presents a hardware scheduling approach where the structure provided to programs by task-based programming models can be incorporated into the scheduler, making it aware of a task's data requirements. This prior knowledge of a task's data requirements allows for better task placement by the scheduler, which results in a reduction in overall cache misses and memory traffic, improving the program's performance and power utilization. Simulations of this technique for a range of synthetic benchmarks and components of real applications have shown a reduction in the number of cache misses by up to 72% and 95% for the L1 and L2 caches, respectively, and up to 30% improvement in overall execution time against FIFO scheduling. This results not only in faster execution and less data transfer, with reductions of up to 50% that lower the load on the interconnect, but also in lower power consumption.
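The contrast the abstract draws between FIFO scheduling and data-aware placement can be illustrated with a small software toy. This is only a sketch under stated assumptions, not the paper's hardware design: the function names (`fifo_place`, `data_aware_place`, `count_misses`), the one-data-block-per-task model, the round-robin FIFO policy, and the two-entry LRU cache per core are all hypothetical choices made for illustration.

```python
from collections import deque

def fifo_place(tasks, n_cores):
    """Hypothetical FIFO baseline: hand tasks to cores round-robin, ignoring data."""
    return [(task, i % n_cores) for i, task in enumerate(tasks)]

def data_aware_place(tasks, n_cores):
    """Hypothetical data-aware policy: reuse the core that last touched the block."""
    owner = {}            # data block -> core that most recently used it
    load = [0] * n_cores  # number of tasks assigned to each core so far
    placement = []
    for name, block in tasks:
        core = owner.get(block)
        if core is None:
            core = load.index(min(load))  # fall back to the least-loaded core
        owner[block] = core
        load[core] += 1
        placement.append(((name, block), core))
    return placement

def count_misses(placement, n_cores, cache_blocks=2):
    """Count misses using a tiny per-core LRU cache (an assumed cache model)."""
    caches = [deque(maxlen=cache_blocks) for _ in range(n_cores)]
    misses = 0
    for (_name, block), core in placement:
        cache = caches[core]
        if block in cache:
            cache.remove(block)   # hit: refresh the block's LRU position
        else:
            misses += 1           # miss: block must be fetched
        cache.append(block)       # a full deque evicts its oldest entry
    return misses

# Twelve tasks cycling over three data blocks, scheduled onto four cores.
tasks = [(f"t{i}", f"block{i % 3}") for i in range(12)]
fifo_misses = count_misses(fifo_place(tasks, 4), 4)
aware_misses = count_misses(data_aware_place(tasks, 4), 4)
print(fifo_misses, aware_misses)  # prints: 12 3
```

In this toy the data-aware policy leaves one core idle, so it trades load balance for locality; a realistic scheduler would need to weigh both, and the figures printed here say nothing about the results reported in the paper.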
Pages: 2309-2338
Number of pages: 29