Algorithmic skeletons for multi-core, multi-GPU systems and clusters

Cited by: 55
Authors
Ernsting, Steffen [1]
Kuchen, Herbert [1]
Affiliations
[1] Department of Information Systems, University of Muenster, 48149 Muenster
Keywords
Algorithmic skeletons; Distributed computing; Distributed memory systems; GPU computing; High performance computing; Message passing; Multiprocessing; Parallel programming; Portable programming; Programming environments; Shared memory systems
DOI
10.1504/IJHPCN.2012.046370
Abstract
Due to the lack of high-level abstractions, developers of parallel applications have to deal with low-level details such as coordinating threads or synchronising processes. Thus, parallel programming remains a difficult and error-prone task. To shield the user from these low-level details, algorithmic skeletons have been proposed. They encapsulate typical parallel programming patterns and have emerged as an efficient approach to simplifying the development of parallel applications. In this paper, we present our skeleton library Muesli, which not only simplifies parallel programming but also allows a single application to be written that may be executed on a variety of parallel machines, ranging from simple multi-core processors with shared memory to clusters of multi- and many-core processors with distributed memory, as well as multi-GPU systems and GPU clusters. This level of platform independence is not reached by other existing approaches that simplify parallel programming. Internally, the skeletons are based on MPI, OpenMP and CUDA. We demonstrate the portability and efficiency of our approach by providing experimental results. Copyright © 2012 Inderscience Enterprises Ltd.
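To make the idea of a skeleton concrete, the following is a minimal C++ sketch of two common data-parallel patterns, map and fold, hidden behind plain function calls. It is not Muesli's API: the names `map_skeleton`, `fold_skeleton` and `sum_of_squares`, their signatures, and the chunking scheme are all hypothetical and chosen only for illustration. The sketch covers shared memory via OpenMP pragmas only; the MPI and CUDA back ends mentioned in the abstract are not modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical map skeleton (not Muesli's API): applies f to every element.
// The pragma parallelises the loop when compiled with OpenMP (e.g. -fopenmp);
// without OpenMP it is ignored and the loop runs sequentially.
template <typename T, typename F>
std::vector<T> map_skeleton(const std::vector<T>& in, F f) {
    std::vector<T> out(in.size());
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(in.size());
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        out[i] = f(in[i]);
    }
    return out;
}

// Hypothetical fold skeleton: reduces each chunk independently, then combines
// the partial results in order. op is assumed to be associative, and identity
// is assumed to be its neutral element.
template <typename T, typename Op>
T fold_skeleton(const std::vector<T>& in, Op op, T identity,
                std::size_t chunks = 4) {
    if (chunks == 0) chunks = 1;
    const std::size_t size = (in.size() + chunks - 1) / chunks;
    std::vector<T> partial(chunks, identity);
    const std::ptrdiff_t c = static_cast<std::ptrdiff_t>(chunks);
    #pragma omp parallel for
    for (std::ptrdiff_t k = 0; k < c; ++k) {
        const std::size_t begin =
            std::min(in.size(), static_cast<std::size_t>(k) * size);
        const std::size_t end = std::min(in.size(), begin + size);
        T acc = identity;
        for (std::size_t i = begin; i < end; ++i) acc = op(acc, in[i]);
        partial[k] = acc;
    }
    T result = identity;
    for (const T& p : partial) result = op(result, p);
    return result;
}

// Usage: the sum of the squares of 1..n, written only in terms of the two
// skeletons; the caller never touches threads or synchronisation.
long sum_of_squares(int n) {
    std::vector<long> xs(n);
    for (int i = 0; i < n; ++i) xs[i] = i + 1;
    auto squares = map_skeleton(xs, [](long x) { return x * x; });
    return fold_skeleton(squares, [](long a, long b) { return a + b; }, 0L);
}
```

The application code in `sum_of_squares` stays the same whether the skeletons run sequentially or in parallel, which is the kind of separation between pattern and platform that the abstract describes.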
Pages: 129-138
Page count: 9
Related papers
22 in total
[1]  
Aldinucci M., Danelutto M., Dazzi P., Muskel: An expandable skeleton environment, Scalable Computing, 8, 4, pp. 325-341, (2007)
[2]  
Benoit A., Cole M., Gilmore S., Hillston J., Flexible skeletal programming with eSkel, Lecture Notes in Computer Science, 3648, pp. 761-770, (2005)
[3]  
Chapman B., Jost G., Van Der Pas R., Using OpenMP - Portable Shared Memory Parallel Programming, (2008)
[4]  
Ciechanowicz P., Algorithmic skeletons for general sparse matrices on multi-core processors, Proceedings of the 20th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS), pp. 188-197, (2008)
[5]  
Cole M., Algorithmic Skeletons: Structured Management of Parallel Computation, (1989)
[6]  
Danelutto M., Pasqualetti F., Pelagatti S., Skeletons for data parallelism in P3L, Lecture Notes in Computer Science, 1300, pp. 619-628, (1997)
[7]  
Enmyren J., Kessler C.W., SkePU: A multi-backend skeleton programming library for multi-GPU systems, Proceedings of the Fourth International Workshop on High-level Parallel Programming and Applications, HLPP '10, pp. 5-14, (2010)
[8]  
Gropp W., Lusk E., Skjellum A., Using MPI - Portable Parallel Programming with the Message-Passing Interface, (1996)
[9]  
Hoberock J., Bell N., Thrust: A Parallel Template Library, (2010)
[10]  
Karasawa Y., Iwasaki H., Parallel skeletons for sparse matrices in SkeTo skeleton library, IPSJ Digital Courier, 4, pp. 167-181, (2008)