The Architecture and the Technology Characterization of an FPGA-based Customizable Application-Specific Vector Processor

Authors
Sykora, Jaroslav [1]
Kohout, Lukas [1]
Bartosinski, Roman [1]
Kafka, Leos [1]
Danek, Martin [1]
Honzik, Petr [2]
Affiliations
[1] ASCR, Vvi, Inst Informat Theory & Automat UTIA, Dept Signal Proc, Vodarenskou Vezi 4, Prague, Czech Republic
[2] CIP Plus sro, Pribram, Czech Republic
Source
2012 IEEE 15TH INTERNATIONAL SYMPOSIUM ON DESIGN AND DIAGNOSTICS OF ELECTRONIC CIRCUITS & SYSTEMS (DDECS) | 2012
Keywords
Custom accelerators; vector processing; FPGA; DSP
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
The traditional approach to IP core design is to use simulations with test vectors. This is not feasible when dealing with complex function cores such as the Image Segmentation case-study algorithm in this paper. An algorithm developer needs to carry out experiments on large real-world data sets, with fast turn-around times, and in real time to facilitate performance tuning and incremental development. We propose a methodology called Application-Specific Vector Processor (ASVP). The ASVP approach first constructs a programmable architecture customized for a given application, then employs software techniques to develop firmware that implements the algorithm. Our sample implementation that supports the Image Segmentation kernel is capable of 332 MFLOPs, 400 MFLOPs, and 250 MFLOPs per coprocessor core in Virtex 5, Virtex 6 and Spartan 6 technologies, respectively. The core size is roughly 1500 slices, depending on the configuration and technology.
Pages: 62-67
Page count: 6