A high-performance data path for synthesizing DSP kernels

Cited by: 21
Authors
Galanis, Michalis D. [1]
Theodoridis, George
Tragoudas, Spyros
Goutis, Costas E.
Affiliations
[1] Univ Patras, Very Large Scale Integrat Design Lab, Dept Elect & Comp Engn, Rion, Greece
[2] Aristotle Univ Thessaloniki, Dept Phys, Thessaloniki 54124, Greece
[3] So Illinois Univ, Dept Elect & Comp Engn, Carbondale, IL 62901 USA
Keywords
binding; chaining; high-performance data path; scheduling; template units
DOI
10.1109/TCAD.2005.855965
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper introduces a high-performance data path for implementing digital signal processing (DSP) kernels. The data path is built from flexible computational components (FCCs). An FCC is a purely combinational circuit that can implement any 2 × 2 template (cluster) of primitive resources, so the data path's performance benefits from intracomponent chaining of operations. Because the FCC's structure is flexible, the data path needs only a small number of such components. This allows direct connections among FCCs and makes it possible to exploit intercomponent chaining, which further improves performance. Owing to the universality and flexibility of the FCC, simple and efficient algorithms suffice for scheduling and binding the data flow graph (DFG). DSP benchmarks synthesized with the FCC data path method show significant performance improvements over template-based data path designs. Detailed results on execution time, FCC utilization, and area are presented.
Pages: 1154-1163
Number of pages: 10