Design and implementation of high performance matrix inversion based on reconfigurable processor

Cited by: 2
Authors
Wang, Kun [1]
Li, Li [1]
Han, Feng [1]
Feng, Fan [1]
Lin, Jun [1]
Affiliation
[1] Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China
Source
IEICE ELECTRONICS EXPRESS | 2016 / Vol. 13 / No. 15
Funding
Specialized Research Fund for the Doctoral Program of Higher Education;
Keywords
reconfigurable processor; matrix inversion; LU decomposition; parallel computing; time-sharing multiplexing; ARCHITECTURE;
DOI
10.1587/elex.13.20160579
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
In this paper, we propose a high-performance matrix inversion implementation on a reconfigurable application-specific processor. Our implementation accelerates inversion of matrices of variable order, ranging from 4 to 144. We adopt LU decomposition to reduce the computational complexity and a pivoting operation to ensure numerical stability. To achieve higher performance within the limited resources, parallel computing and time-sharing multiplexing are employed. Chip testing results show that our implementation improves inversion performance efficiently. The highest parallel speed-up ratio reaches 3x, and the execution time of a 144 x 144 matrix inversion is 4.07 ms.
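The abstract names LU decomposition with pivoting as the basis of the inversion. As a rough illustration only, the following is a minimal sequential software sketch of that general technique in Python. It is not the paper's hardware design: the function name `lu_inverse`, the choice of partial (row) pivoting, and the column-by-column solve are assumptions made for this example, and the parallel computing and time-sharing multiplexing described in the abstract are not modeled.

```python
def lu_inverse(a):
    """Invert a square matrix using LU decomposition with partial pivoting.

    Hypothetical helper for illustration; `a` is a list of equal-length rows.
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]  # working copy, holds L and U
    perm = list(range(n))                        # row permutation: P*A = L*U

    for k in range(n):
        # Partial pivoting: pick the row with the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Eliminate below the pivot; multipliers are stored in place as L.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]

    # Solve A x = e_c for each unit vector e_c to build the inverse by columns.
    inv = [[0.0] * n for _ in range(n)]
    for c in range(n):
        # Forward substitution: L y = P e_c (L has an implicit unit diagonal).
        y = [0.0] * n
        for i in range(n):
            b = 1.0 if perm[i] == c else 0.0
            y[i] = b - sum(lu[i][j] * y[j] for j in range(i))
        # Back substitution: U x = y.
        for i in range(n - 1, -1, -1):
            s = y[i] - sum(lu[i][j] * inv[j][c] for j in range(i + 1, n))
            inv[i][c] = s / lu[i][i]
    return inv
```

For example, `lu_inverse([[0, 1], [1, 0]])` requires a row swap at the first step, which is the case where pivoting matters: without it the first pivot would be zero.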
Pages: 12