An efficient algorithm for out-of-core matrix transposition

被引:16
|
作者
Suh, J
Prasanna, VK
机构
[1] USC, Inst Sci Informat, Arlington, VA 22203 USA
[2] Univ So Calif, Dept Elect Engn Syst, Los Angeles, CA 90089 USA
关键词
matrix transpose; data transfer time; index computation time; I/O time; out-of-core; execution time;
D O I
10.1109/12.995452
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Efficient transposition of Out-of-core matrices has been widely studied. These efforts have focused on reducing the number of I/O operations. However, in state-of-the-art architectures, memory-memory data transfer time and index computation time are also significant components of the overall time. In this paper, we propose an algorithm that considers the index computation time and the I/O time and reduces the overall execution time. Our algorithm reduces the total execution time by reducing the number of I/O operations and eliminating the index computation. In doing so, two techniques are employed: writing the data onto disk in predefined patterns and balancing the number of disk read and write operations. The index computation time, which is an expensive operation involving two divisions and a multiplication, is eliminated by partitioning the memory into read and write buffers, The expensive in-processor permutation is replaced by data collection from the read buffer to the write buffer. Even though this partitioning may increase the number of I/O operations for some cases, it results in an overall reduction in the execution time due to the elimination of the expensive index computation. Our algorithm is analyzed using the well-known Linear Model and the Parallel Disk Model. The experimental results on Sun Enterprise, SGI R12000, and Pentium III show that our algorithm reduces the overall execution time by up to 50 percent compared with the best known algorithms in the literature.
引用
收藏
页码:420 / 438
页数:19
相关论文
共 50 条
  • [1] Efficient Out-of-Core and Out-of-Place Rectangular Matrix Transposition and Rotation
    Godard, Paul
    Loechner, Vincent
    Bastoul, Cedric
    IEEE TRANSACTIONS ON COMPUTERS, 2021, 70 (11) : 1942 - 1948
  • [2] Design of an Efficient Out-of-Core Read Alignment Algorithm
    Konagurthu, Arun S.
    Allison, Lloyd
    Conway, Thomas
    Beresford-Smith, Bryan
    Zobel, Justin
    ALGORITHMS IN BIOINFORMATICS, 2010, 6293 : 189 - 201
  • [3] OCAM: Out-of-core coordinate descent algorithm for matrix completion
    Lee, Dongha
    Oh, Jinoh
    Yu, Hwanjo
    INFORMATION SCIENCES, 2020, 514 (514) : 587 - 604
  • [4] Efficient parallel out-of core matrix transposition
    Krishnamoorthy, S
    Baumgartner, G
    Cociorva, D
    Lam, CC
    Sadayappan, P
    IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, PROCEEDINGS, 2003, : 300 - 307
  • [5] Efficient Out-of-Core Contig Generation
    Prieto Entenza, Julio Omar
    Haeusler, Edward Hermann
    Lifschitz, Sergio
    ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, BSB 2020, 2020, 12558 : 25 - 37
  • [6] An efficient and scalable parallel algorithm for out-of-core isosurface extraction and rendering
    Wang, Qin
    Jaja, Joseph
    Varshney, Amitabh
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2007, 67 (05) : 592 - 603
  • [7] Efficient parallel out-of-core isosurface extraction
    Zhang, HJ
    Newman, TS
    PVG 2003 PROCEEDINGS, 2003, : 9 - 16
  • [8] Efficient view-dependent out-of-core visualization
    Guthe, M
    Borodin, P
    Klein, R
    FOURTH INTERNATIONAL CONFERENCE ON VIRTUAL REALITY AND ITS APPLICATIONS IN INDUSTRY, 2004, 5444 : 428 - 438
  • [9] Efficient methods for out-of-core sparse Cholesky factorization
    Rothberg, E
    Schreiber, R
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 1999, 21 (01): : 129 - 144
  • [10] Performance prediction and analysis of parallel out-of-core matrix factorization
    Caron, E
    Lazure, D
    Utard, G
    HIGH PERFORMANCE COMPUTING - HIPC 2000, PROCEEDINGS, 2001, 1970 : 161 - 172