Hybrid CPU-GPU scheduling and execution of tree traversals

被引:1
作者
Liu, Jianqiao [1 ]
Hegde, Nikhil [1 ]
Kulkarni, Milind [1 ]
机构
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
关键词
Heterogeneous architectures; Scheduling; Irregular applications; Tree traversal;
D O I
10.1145/2851141.2851174
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
GPUs offer the promise of massive, power-efficient parallelism. However, exploiting this parallelism requires code to be carefully structured to deal with the limitations of the SIMT execution model. In recent years, there has been much interest in mapping irregular applications to GPUs: applications with unpredictable, data-dependent behaviors. While most of the work in this space has focused on ad hoc implementations of specific algorithms, recent work has looked at generic techniques for mapping a large class of tree traversal algorithms to GPUs, through careful restructuring of the tree traversal algorithms to make them behave more regularly. Unfortunately, even this general approach for GPU execution of tree traversal algorithms is reliant on ad hoc, handwritten, algorithm-specific scheduling (i.e., assignment of threads to warps) to achieve high performance. The key challenge of scheduling is that it is a highly irregular process, that requires the inspection of thread behavior and then careful sorting of the threads into warps. In this paper, we present a novel scheduling and execution technique for tree traversal algorithms that is both general and automatic. The key novelty is a hybrid approach: the GPU partially executes tasks to inspect thread behavior and transmits information back to the CPU, which uses that information to perform the scheduling itself, before executing the remaining, carefully scheduled, portion of the traversals on the GPU. We applied this framework to five tree traversal algorithms, achieving significant speedups over optimized GPU code that does not perform application-specific scheduling. Further, we show that in many cases, our hybrid approach is able to deliver better performance even than GPU code that uses hand-tuned, applicationspecific scheduling.
引用
收藏
页码:385 / 386
页数:2
相关论文
共 38 条
  • [31] A parallel local search in CPU/GPU for scheduling independent tasks on large heterogeneous computing systems
    Iturriaga, Santiago
    Nesmachnow, Sergio
    Luna, Francisco
    Alba, Enrique
    JOURNAL OF SUPERCOMPUTING, 2015, 71 (02) : 648 - 672
  • [32] A parallel local search in CPU/GPU for scheduling independent tasks on large heterogeneous computing systems
    Santiago Iturriaga
    Sergio Nesmachnow
    Francisco Luna
    Enrique Alba
    The Journal of Supercomputing, 2015, 71 : 648 - 672
  • [33] RAISE: Efficient GPU Resource Management via Hybrid Scheduling
    Weng, Yue
    Ge, Tianao
    Zhang, Xi
    Zhang, Xianwei
    Lu, Yutong
    2022 22ND IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2022), 2022, : 685 - 694
  • [34] GPU-Based Hybrid Cellular Genetic Algorithm for Job-Shop Scheduling Problem
    Amrane, Abdelkader
    Debbat, Fatima
    Yahyaoui, Khadidja
    INTERNATIONAL JOURNAL OF APPLIED METAHEURISTIC COMPUTING, 2021, 12 (02) : 1 - 15
  • [35] Hybrid Monte Carlo tree search based multi-objective scheduling
    Hofmann, Constantin
    Liu, Xinhong
    May, Marvin
    Lanza, Gisela
    PRODUCTION ENGINEERING-RESEARCH AND DEVELOPMENT, 2023, 17 (01): : 133 - 144
  • [36] Hybrid Monte Carlo tree search based multi-objective scheduling
    Constantin Hofmann
    Xinhong Liu
    Marvin May
    Gisela Lanza
    Production Engineering, 2023, 17 : 133 - 144
  • [37] Optimal One-Wafer Cyclic Scheduling of Hybrid Multirobot Cluster Tools With Tree Topology
    Yang, FaJun
    Wu, NaiQi
    Qiao, Yan
    Zhou, MengChu
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2018, 48 (02): : 289 - 298
  • [38] Task Scheduling for GPU Accelerated Hybrid OLAP Systems with Multi-core Support and Text-to-Integer Translation
    Malik, Maria
    Riha, Lubomir
    Shea, Colin
    El-Ghazawi, Tarek
    2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW), 2012, : 1987 - 1996