Performance analysis of the parallel Karatsuba multiplication algorithm for distributed memory architectures

被引:9
|
作者
Cesari, G
Maeder, R
机构
[1] Inst. for Theor. Computer Science, ETH
[2] Universita degli Studi di Trieste, DEEI
关键词
D O I
10.1006/jsco.1996.0026
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We present three parallel implementations of the Karatsuba algorithm for long integer multiplication on a distributed memory architecture and discuss the experimental results obtained on a Paragon computer. The first two implementations have both time complexity O(n) on n(log2 3) processors, but present different behavior for inputs of practical size. The third algorithm has complexity O(n log(2) n) on n processors, offering therefore better asymptotic efficiency. A refinement of the asymptotic analysis for the important case of a constant number of processors takes into account sequential parts of the algorithm and communications overhead. It is shown that the theoretically best speed-up and efficiency can be obtained with two of the algorithms for sufficient problem size. The experimental results confirm the analysis. (C) 1996 Academic Press Limited
引用
收藏
页码:467 / 473
页数:7
相关论文
共 50 条
  • [1] Performance Analysis of Karatsuba Multiplication Algorithm for Different Bit Lengths
    Eyupoglu, Can
    WORLD CONFERENCE ON TECHNOLOGY, INNOVATION AND ENTREPRENEURSHIP, 2015, : 1860 - 1864
  • [2] On karatsuba multiplication algorithm
    Fang, Xianjin
    Li, Longshu
    PROCEEDINGS OF THE FIRST INTERNATIONAL SYMPOSIUM ON DATA, PRIVACY, AND E-COMMERCE, 2007, : 274 - 276
  • [3] A parallel implementation of the nonsymmetric QR algorithm for distributed memory architectures
    Henry, G
    Watkins, D
    Dongarra, J
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2002, 24 (01): : 284 - 311
  • [4] On the Performance of Parallel Neural Network Implementations on Distributed Memory Architectures
    Ganeshamoorthy, K.
    Ranasinghe, D. N.
    CCGRID 2008: EIGHTH IEEE INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, VOLS 1 AND 2, PROCEEDINGS, 2008, : 90 - 97
  • [5] Parallel performance prediction for multigrid codes on distributed memory architectures
    Romanazzi, Giuseppe
    Jimack, Peter K.
    HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, PROCEEDINGS, 2007, 4782 : 647 - 658
  • [6] New parallel matrix multiplication algorithm on distributed-memory concurrent computers
    Choi, Jaeyoung
    Proceedings of the Conference on High Performance Computing on the Information Superhighway, HPC Asia'97, 1997, : 224 - 229
  • [7] A new parallel matrix multiplication algorithm on distributed-memory concurrent computers
    Choi, J
    HIGH PERFORMANCE COMPUTING ON THE INFORMATION SUPERHIGHWAY - HPC ASIA '97, PROCEEDINGS, 1997, : 224 - 229
  • [8] Design and analysis of Single Precision Floating Point Multiplication using Karatsuba Algorithm and Parallel Prefix Adders
    Gowreesrinivas, K. V.
    Samundiswary, P.
    2017 FOURTH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATION AND NETWORKING (ICSCN), 2017,
  • [9] A new parallel matrix multiplication algorithm on distributed-memory concurrent computers
    Choi, J
    CONCURRENCY-PRACTICE AND EXPERIENCE, 1998, 10 (08): : 655 - 670
  • [10] Performance analysis of distributed symmetric sparse matrix vector multiplication algorithm for multi-core architectures
    Oryspayev, Dossay
    Aktulga, Hasan Metin
    Sosonkina, Masha
    Maris, Pieter
    Vary, James P.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (17): : 5019 - 5036