Packet switched vs. time multiplexed FPGA overlay networks

Cited by: 0
Authors
Kapre, Nachiket [1]
Mehta, Nikil [1]
deLorimier, Michael [1]
Rubin, Raphael [1]
Barnor, Henry [1]
Wilson, Michael J. [1]
Wrighton, Michael [1]
DeHon, Andre [1]
Affiliations
[1] CALTECH, Dept CS, MC 256-80, Pasadena, CA 91125 USA
Source
FCCM 2006: 14TH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS | 2006
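Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract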
Dedicated, spatially configured FPGA interconnect is efficient for applications that require high-throughput connections between processing elements (PEs) but with a limited degree of PE interconnectivity (e.g., wiring up gates and datapaths). Applications which virtualize PEs may require a large number of distinct PE-to-PE connections (e.g., using one PE to simulate hundreds of operators, each requiring input data from thousands of other operators), but with each connection having low throughput compared with the PE's operating cycle time. In these highly interconnected conditions, dedicating spatial interconnect resources for all possible connections is costly and inefficient. Alternatively, we can time-share physical network resources by virtualizing interconnect links, either by statically scheduling the sharing of resources prior to runtime or by dynamically negotiating resources at runtime. We explore the tradeoffs (e.g., area, route latency, route quality) between time-multiplexed and packet-switched networks overlaid on top of commodity FPGAs. We demonstrate modular and scalable networks which operate on a Xilinx XC2V6000-4 at 166 MHz. For our applications, time-multiplexed, offline scheduling offers up to a 63% performance increase over online, packet-switched scheduling for equivalent topologies. When applying designs to equivalent area, packet-switching is up to 2x faster for small-area designs while time-multiplexing is up to 5x faster for larger-area designs. When limited to the capacity of an XC2V6000, if all communication is known, time-multiplexed routing outperforms packet-switching; however, when the active set of links drops below 40% of the potential links, packet-switched routing can outperform time-multiplexing.
Pages: 205 / +
Page count: 3