共 23 条
[1]
Dagum L(2002)OpenMP: an industry standard API for shared-memory programming IEEE Comput. Sci. Eng. 5 46-55
[2]
Menon R(2008)A study of process arrival patterns for MPI collective operations Int. J. Parallel Prog. 36 543-570
[3]
Faraj A(1991)Dataflow analysis of scalar and array references Int. J. Parallel Prog. 20 23-53
[4]
Patarasuk P(2010)Towards performance portability through runtime adaption for high performance computing applications Concurr. Comput. Pract. Exp. 22 2230-2246
[5]
Yuan X(2011)High-performance and scalable non-blocking all-to-all with collective offload on infiniband clusters: a study with parallel 3d fft Comput. Sci. Res. Dev. 26 237-246
[6]
Feautrier P(2011)Auto-tuning full applications: a case study Int. J. High Perform. Comput. Appl. 25 286-294
[7]
Gabriel E(undefined)undefined undefined undefined undefined-undefined
[8]
Feki S(undefined)undefined undefined undefined undefined-undefined
[9]
Benkert K(undefined)undefined undefined undefined undefined-undefined
[10]
Resch MM(undefined)undefined undefined undefined undefined-undefined