Efficient parallelization of unstructured reductions on shared memory parallel architectures

Cited by: 0
Authors
Benkner, S
Brandes, T
Affiliations
[1] Univ Vienna, Inst Software Technol & Parallel Syst, A-1090 Vienna, Austria
[2] GMD, SCAI, D-53754 St Augustin, Germany
Source
PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS | 2000 / Volume 1800
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202 ;
Abstract
This paper presents a new parallelization method for an efficient implementation of unstructured array reductions on shared memory parallel machines with OpenMP. This method is strongly related to parallelization techniques for irregular reductions on distributed memory machines as employed in the context of High Performance Fortran. By exploiting data locality, synchronization is minimized without introducing severe memory or computational overheads as observed with most existing shared memory parallelization techniques.
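The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: exploiting data locality so that an unstructured reduction `x[idx[i]] += val[i]` needs no locks or atomics. It assumes an inspector/executor scheme of the kind associated with distributed memory compilers: the reduction array is split into contiguous blocks, an inspector groups the iterations by the block that owns their target, and an OpenMP loop then lets each thread update only the blocks it was given. All names (`schedule_t`, `build_schedule`, `reduce_scheduled`, `nblocks`) are hypothetical and are not taken from the paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical schedule: iterations grouped by the block owning their target. */
typedef struct {
    size_t nblocks;  /* number of contiguous blocks of the reduction array */
    size_t *start;   /* nblocks + 1 offsets into iters */
    size_t *iters;   /* iteration numbers, grouped by owning block */
} schedule_t;

/* Sequential reference: x[idx[i]] += val[i]. */
void reduce_seq(double *x, const int *idx, const double *val, size_t n)
{
    for (size_t i = 0; i < n; i++)
        x[idx[i]] += val[i];
}

/* Inspector (assumed design). Requires 0 <= idx[i] < m. Returns 0 on success. */
int build_schedule(schedule_t *s, const int *idx, size_t n, size_t m,
                   size_t nblocks)
{
    assert(m > 0 && nblocks > 0);
    size_t bs = (m + nblocks - 1) / nblocks;  /* block size, rounded up */
    s->nblocks = nblocks;
    s->start = calloc(nblocks + 1, sizeof *s->start);
    s->iters = malloc((n ? n : 1) * sizeof *s->iters);
    if (!s->start || !s->iters) {
        free(s->start);
        free(s->iters);
        return -1;
    }
    for (size_t i = 0; i < n; i++)            /* count iterations per block */
        s->start[(size_t)idx[i] / bs + 1]++;
    for (size_t b = 0; b < nblocks; b++)      /* prefix sum gives block offsets */
        s->start[b + 1] += s->start[b];
    for (size_t i = 0; i < n; i++)            /* stable fill, advances start[b] */
        s->iters[s->start[(size_t)idx[i] / bs]++] = i;
    for (size_t b = nblocks; b > 0; b--)      /* shift offsets back into place */
        s->start[b] = s->start[b - 1];
    s->start[0] = 0;
    return 0;
}

/* Executor: each block is handled by exactly one thread, and no two blocks
 * share an element of x, so the updates need no synchronization. */
void reduce_scheduled(double *x, const schedule_t *s, const int *idx,
                      const double *val)
{
    #pragma omp parallel for schedule(static)
    for (long b = 0; b < (long)s->nblocks; b++) {
        for (size_t k = s->start[b]; k < s->start[b + 1]; k++) {
            size_t i = s->iters[k];
            x[idx[i]] += val[i];
        }
    }
}

void free_schedule(schedule_t *s)
{
    free(s->start);
    free(s->iters);
}
```

In this sketch the schedule costs one extra index array of length `n` and can be reused as long as `idx` does not change. Because the inspector keeps iterations in their original order within each block, every element of `x` receives its updates in the same order as in `reduce_seq`. Compiled without OpenMP support, the pragma is ignored and the executor simply runs sequentially.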
Pages: 435-442
Page count: 8