Performance analysis of the parallel Karatsuba multiplication algorithm for distributed memory architectures

被引:9
作者
Cesari, G
Maeder, R
机构
[1] Inst. for Theor. Computer Science, ETH
[2] Universita degli Studi di Trieste, DEEI
关键词
D O I
10.1006/jsco.1996.0026
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We present three parallel implementations of the Karatsuba algorithm for long integer multiplication on a distributed memory architecture and discuss the experimental results obtained on a Paragon computer. The first two implementations have both time complexity O(n) on n(log2 3) processors, but present different behavior for inputs of practical size. The third algorithm has complexity O(n log(2) n) on n processors, offering therefore better asymptotic efficiency. A refinement of the asymptotic analysis for the important case of a constant number of processors takes into account sequential parts of the algorithm and communications overhead. It is shown that the theoretically best speed-up and efficiency can be obtained with two of the algorithms for sufficient problem size. The experimental results confirm the analysis. (C) 1996 Academic Press Limited
引用
收藏
页码:467 / 473
页数:7
相关论文
共 50 条
[21]   PERFORMANCE MODELING OF DISTRIBUTED MEMORY ARCHITECTURES [J].
JOHNSSON, SL .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1991, 12 (04) :300-312
[22]   Visualising algorithm performance on parallel architectures [J].
Pears, AN ;
Shaker, K .
PROCEEDINGS OF THE HIGH-PERFORMANCE COMPUTING (HPC'98), 1998, :305-310
[23]   Parallel Genehunter: implementation of a linkage analysis package for distributed-memory architectures [J].
Conant, GC ;
Plimpton, SJ ;
Old, W ;
Wagner, A ;
Fain, PR ;
Pacheco, TR ;
Heffelfinger, G .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2003, 63 (7-8) :674-682
[24]   Task clustering and scheduling for distributed memory parallel architectures [J].
Palis, MA ;
Liou, JC ;
Wei, DSL .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1996, 7 (01) :46-55
[25]   Fast parallel particle simulations on distributed memory architectures [J].
Hipp, M ;
Kunze, S ;
Ritt, M ;
Rosenstiel, W ;
Ruder, H .
HIGH PERFORMANCE COMPUTING IN SCIENCE AND ENGINEERING 01, 2002, :485-499
[26]   A Framework for Parallel Genetic Algorithms for Distributed Memory Architectures [J].
Georgiev, Dobromir ;
Atanassov, Emanouil ;
Alexandrov, Vassil .
2014 5TH WORKSHOP ON LATEST ADVANCES IN SCALABLE ALGORITHMS FOR LARGE-SCALE SYSTEMS (SCALA), 2014, :47-53
[27]   INDEXED, GLOBAL OBJECTS FOR DISTRIBUTED MEMORY PARALLEL ARCHITECTURES [J].
BAIN, WL .
SIGPLAN NOTICES, 1989, 24 (04) :95-98
[28]   Towards structured parallel computing on architecture-independent parallel algorithm design for distributed-memory architectures [J].
Gao, F .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1996, 53 (01) :112-128
[29]   A novel parallel algorithm for large-scale Fock matrix construction with small locally distributed memory architectures:: RT parallel algorithm [J].
Takashima, H ;
Yamada, S ;
Obara, S ;
Kitamura, K ;
Inabata, S ;
Miyakawa, N ;
Tanabe, K ;
Nagashima, U .
JOURNAL OF COMPUTATIONAL CHEMISTRY, 2002, 23 (14) :1337-1346
[30]   Performance pounds for distributed memory multithreaded architectures [J].
Zuberek, WM ;
Govindarajan, R .
1998 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5, 1998, :232-237