Hardware-software optimizations of reconfigurable multi-core processors for floating-point computations of large sparse matrices

Citations: 1
Authors
Wang, Xiaofang [1]
Affiliations
[1] Villanova Univ, Dept Elect & Comp Engn, Villanova, PA 19085 USA
Keywords
FPGA; Multi-core processor on a programmable chip; Parallel LU factorization; Hardware customization; Dynamic scheduling; Systems; Operations
DOI
10.1007/s11554-012-0277-2
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
State-of-the-art field-programmable gate array (FPGA) technologies provide opportunities to develop more flexible, less expensive, and higher-performance floating-point computing platforms for embedded systems. To better harness the full power of FPGAs and to make them accessible to more system designers, we investigate the unique advantages and optimization opportunities, in both software and hardware, offered by multi-core processors on a programmable chip (MPoPCs). In this paper, we present our hardware customization and software dynamic scheduling solutions for LU factorization of large sparse matrices on MPoPCs developed in-house. Theoretical analysis is provided to guide the design. Implementation results on an Altera Stratix III FPGA are presented for five benchmark matrices of size up to 7,917 × 7,917. Our hardware customization alone can reduce the execution time by up to 17.22%. The integrated hardware-software optimization improves the speedup by an average of 60.30%.
Pages: 187-204
Page count: 18