Performance Optimization for CPU-GPU Heterogeneous Parallel System

Cited by: 0

Authors:

Wang, Yanhua [1]

Qiao, Jianzhong [1]

Lin, Shukuan [1]

Zhao, Tinglei [1]

Affiliation:

[1] Northeastern Univ, Coll Comp Sci & Engn, Shenyang, Peoples R China

Source:

2016 12TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD) | 2016

Keywords:

GPU; SVM; task pre-treating; task allocation; heterogeneous system;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

With GPUs (Graphics Processing Units) taking part in general-purpose computing, a heterogeneous system usually achieves higher performance and efficiency. Many studies address how to improve the performance of heterogeneous systems, and a number of them do so by allocating workload to processors with different strategies. In this paper, we implement a task allocation model based on the principle of making the execution time of the CPU partition as close as possible to that of the GPU partition. The task allocation process has two stages. First, in the pre-treating stage, we use an SVM (Support Vector Machine) to classify tasks into two sets, CPU-kind and GPU-kind. Second, we adjust the two task sets according to the characteristics and current running status of the processors, and then map the adjusted task sets to the processors. We evaluate the proposed model by implementing it on a real heterogeneous system with several benchmarks. Experimental results show that our model achieves a performance improvement of up to 23.43% on average compared with several state-of-the-art allocation strategies.
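The balancing principle in the abstract (bring the CPU partition's execution time as close as possible to the GPU partition's) can be illustrated with a small sketch. This is not the authors' algorithm: the record does not give its details, so the greedy rule below is an assumption chosen only for illustration. The SVM pre-treating stage is replaced by two already-labeled input lists, and the names `Task`, `cpu_time`, `gpu_time`, and `rebalance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpu_time: int  # assumed estimate of run time on the CPU (arbitrary units)
    gpu_time: int  # assumed estimate of run time on the GPU (arbitrary units)


def rebalance(cpu_set, gpu_set):
    """Greedily move one task at a time from the busier processor to the
    other one, as long as the move lowers the overall finish time.

    This is an illustrative heuristic, not the model from the paper.
    """
    cpu_set, gpu_set = list(cpu_set), list(gpu_set)
    while True:
        cpu_load = sum(t.cpu_time for t in cpu_set)
        gpu_load = sum(t.gpu_time for t in gpu_set)
        makespan = max(cpu_load, gpu_load)

        if cpu_load > gpu_load:
            # Finish time after moving each CPU task to the GPU.
            candidates = [
                (max(cpu_load - t.cpu_time, gpu_load + t.gpu_time), i)
                for i, t in enumerate(cpu_set)
            ]
            src, dst = cpu_set, gpu_set
        else:
            # Finish time after moving each GPU task to the CPU.
            candidates = [
                (max(cpu_load + t.cpu_time, gpu_load - t.gpu_time), i)
                for i, t in enumerate(gpu_set)
            ]
            src, dst = gpu_set, cpu_set

        # Stop when there is nothing to move or no move helps.
        if not candidates or min(candidates)[0] >= makespan:
            return cpu_set, gpu_set

        _, idx = min(candidates)
        dst.append(src.pop(idx))
```

Because every accepted move strictly lowers the finish time, the loop terminates. A real allocator would also need the run-time estimates themselves and the current processor status, which this sketch takes as given.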

Pages: 1259-1266

Page count: 8

