Performance analysis of the parallel Karatsuba multiplication algorithm for distributed memory architectures

被引:9
作者
Cesari, G
Maeder, R
机构
[1] Inst. for Theor. Computer Science, ETH
[2] Universita degli Studi di Trieste, DEEI
关键词
D O I
10.1006/jsco.1996.0026
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We present three parallel implementations of the Karatsuba algorithm for long integer multiplication on a distributed memory architecture and discuss the experimental results obtained on a Paragon computer. The first two implementations have both time complexity O(n) on n(log2 3) processors, but present different behavior for inputs of practical size. The third algorithm has complexity O(n log(2) n) on n processors, offering therefore better asymptotic efficiency. A refinement of the asymptotic analysis for the important case of a constant number of processors takes into account sequential parts of the algorithm and communications overhead. It is shown that the theoretically best speed-up and efficiency can be obtained with two of the algorithms for sufficient problem size. The experimental results confirm the analysis. (C) 1996 Academic Press Limited
引用
收藏
页码:467 / 473
页数:7
相关论文
共 50 条
[41]   PARALLELIZING STRASSENS METHOD FOR MATRIX MULTIPLICATION ON DISTRIBUTED-MEMORY MIMD ARCHITECTURES [J].
CHOU, CC ;
DENG, YF ;
LI, G ;
WANG, Y .
COMPUTERS & MATHEMATICS WITH APPLICATIONS, 1995, 30 (02) :49-69
[42]   Strategies for Parallel Execution of Cellular Automata in Distributed Memory Architectures [J].
Giordano, Andrea ;
De Rango, Alessio ;
D'Ambrosio, Donato ;
Rongo, Rocco ;
Spataro, William .
2019 27TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING (PDP), 2019, :406-413
[43]   A Karatsuba-Based Algorithm for Polynomial Multiplication in Chebyshev Form [J].
Lima, Juliano B. ;
Panario, Daniel ;
Wang, Qiang .
IEEE TRANSACTIONS ON COMPUTERS, 2010, 59 (06) :835-841
[44]   PARALLEL ALGORITHMS AND ARCHITECTURES FOR MATRIX MULTIPLICATION [J].
PUGLISI, C .
COMPUTERS & MATHEMATICS WITH APPLICATIONS, 1989, 17 (12) :1567-1572
[45]   Performance Analysis of OpenMP Scheduling Type on Embarrassingly Parallel Matrix Multiplication Algorithm [J].
Qun, Ng Hui ;
Khalib, Z. I. A. ;
Raof, R. A. A. .
RECENT TRENDS IN INFORMATION AND COMMUNICATION TECHNOLOGY, 2018, 5 :917-925
[46]   An Efficient Pipelined Parallel Join Algorithm on Heterogeneous Distributed Architectures [J].
Hassan, Mohamad Al Hajj ;
Bamha, Mostafa .
SOFTWARE AND DATA TECHNOLOGIES, 2009, 47 :119-133
[47]   Area-Efficient Barrett Modular Multiplication With Optimized Karatsuba Algorithm [J].
Zhang, Bo ;
Yan, Shoumeng .
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43 (12) :4626-4639
[48]   Analysis and performance of a distributed memory multilevel fast multipole algorithm [J].
Velamparambil, S ;
Chew, WC .
IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2005, 53 (08) :2719-2727
[49]   Performance analysis of explicit group parallel algorithms for distributed memory multicomputer [J].
Ng, Kok Fu ;
Ali, Norhashidah Hj. Mohd .
PARALLEL COMPUTING, 2008, 34 (6-8) :427-440
[50]   PERFORMANCE ANALYSIS OF DISTRIBUTED-MEMORY COMPUTERS WITH PARALLEL NODE ARCHITECTURE [J].
IANNELLO, G ;
MAZZEO, A ;
MAZZOCCA, N .
JOURNAL OF SYSTEMS AND SOFTWARE, 1995, 29 (02) :107-120