Profile-Guided Deployment of Stream Programs on Multicores

被引:7
作者
Farhad, S. M. [1 ]
Ko, Yousun [2 ]
Burgstaller, Bernd [2 ]
Scholz, Bernhard [1 ]
机构
[1] Univ Sydney, Sydney, NSW 2006, Australia
[2] Yonsei Univ, Seoul 120749, South Korea
基金
澳大利亚研究理事会; 新加坡国家研究基金会;
关键词
Languages; Algorithms; Profiling; Performance; Optimization; StreamIt; multicore; stream programming;
D O I
10.1145/2345141.2248430
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Because multicore architectures have become the industry standard, programming abstractions for concurrent programming are of key importance. Stream programming languages facilitate application domains characterized by regular sequences of data, such as multimedia, graphics, signal processing and networking. With stream programs, computations are expressed through independent actors that interact through FIFO data channels. A major challenge with stream programs is to load-balance actors among available processing cores. The workload of a stream program is determined by actor execution times and the communication overhead induced by data channels. Estimating communication costs on cache-coherent shared-memory multiprocessors is difficult, because data movements are abstracted away by the cache coherence protocol. Standard execution time profiling techniques cannot separate actor execution times from communication costs, because communication costs manifest in terms of execution time overhead. In this work we present a unified Integer Linear Programming (ILP) formulation that balances the workload of stream programs on cache-coherent multicore architectures. For estimating the communication costs of data channels, we devise a novel profiling scheme that minimizes the number of profiling steps. We conduct experiments across a range of StreamIt benchmarks and show that our method achieves a speedup of up to 4.02x on 6 processors. The number of profiling steps is on average only 17% of an exhaustive profiling run over all data channels of a stream program.
引用
收藏
页码:79 / 88
页数:10
相关论文
共 35 条
[1]  
Amarasinghe S., 2010, STREAMIT
[2]  
[Anonymous], 2001, Approximation algorithms
[3]  
[Anonymous], 2003, AMPL: A Modeling Language for Mathematical Programming
[4]  
[Anonymous], P INT C COMP ARCH SY
[5]  
Backus J., 2007, ACM TURING AWARD LEC
[6]  
Battacharyya Shuvra S., 1996, Software Synthesis from Dataflow Graphs
[7]   On colouring the nodes of a network [J].
Brooks, RL .
PROCEEDINGS OF THE CAMBRIDGE PHILOSOPHICAL SOCIETY, 1941, 37 :194-197
[8]  
Bryant Randal E, 2003, COMPUTER SYSTEMS PRO
[9]   Brook for GPUs: Stream computing on graphics hardware [J].
Buck, I ;
Foley, T ;
Horn, D ;
Sugerman, J ;
Fatahalian, K ;
Houston, M ;
Hanrahan, P .
ACM TRANSACTIONS ON GRAPHICS, 2004, 23 (03) :777-786
[10]  
Chen M. K., 2005, PLDI 05