Cross-Architecture Performance Prediction (XAPP) Using CPU Code to Predict GPU Performance

Cited by: 68

Authors:

Ardalani, Newsha [1]

Lestourgeon, Clint [1]

Sankaralingam, Karthikeyan [1]

Zhu, Xiaojin [1]

Affiliations:

[1] Univ Wisconsin Madison, Madison, WI 53706 USA

Source:

PROCEEDINGS OF THE 48TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO-48) | 2015

Keywords:

GPU; Cross-platform Prediction; Performance Modeling; Machine Learning; REGRESSION-MODELS; DESIGN SPACE; HARDWARE; BENCHMARKS;

DOI:

10.1145/2830772.2830780

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

GPUs have become prevalent and more general purpose, but GPU programming remains challenging and time-consuming for the majority of programmers. In addition, it is not always clear which codes will benefit from being ported to a GPU. Therefore, a tool that estimates GPU performance for a piece of code before a GPU implementation is written is highly desirable. To this end, we propose Cross-Architecture Performance Prediction (XAPP), a machine-learning-based technique that uses only a single-threaded CPU implementation to predict GPU performance. Our paper is built on the following two insights: i) Execution time on a GPU is a function of program properties and hardware characteristics. ii) By examining a vast array of previously implemented GPU codes along with their CPU counterparts, we can use established machine learning techniques to learn this correlation between program properties, hardware characteristics, and GPU execution time. We use an adaptive two-level machine learning solution. Our results show that our tool is robust and accurate: we achieve 26.9% average error on a set of 24 real-world kernels. We also discuss practical usage scenarios for XAPP.


Pages: 725-737

Page count: 13
