Modular Switched Multiported SRAM-Based Memories

Cited by: 4
Authors
Abdelhadi, Ameer M. S. [1]
Lemieux, Guy G. F. [1]
Affiliations
[1] Univ British Columbia, Dept Elect & Comp Engn, 2332 Main Mall, Vancouver, BC V6T 1Z4, Canada
Funding
National Science Foundation (USA); Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Design; Algorithms; Performance; Embedded memory; programmable memory; block RAM; multiported memory; shared memory; cache memory; register file; parallel memory access; architecture
DOI
10.1145/2851506
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Multiported RAMs are essential for high-performance parallel computation systems. VLIW and vector processors, CGRAs, DSPs, CMPs, and other processing systems often rely upon multiported memories for parallel access. Although memories with a large number of read and write ports are important, their high implementation cost means that they are used sparingly. As a result, FPGA vendors only provide dual-ported block RAMs (BRAMs) to handle the majority of usage patterns. Furthermore, recent attempts to create FPGA-based multiported memories suffer from low storage utilization. Whereas most approaches provide simple unidirectional ports with a fixed read or write, others propose true bidirectional ports where each port dynamically switches read and write. True RAM ports are useful for systems with transceivers and provide high RAM flexibility; however, this flexibility incurs high BRAM consumption. In this article, a novel, modular, and BRAM-based switched multiported RAM architecture is proposed. In addition to unidirectional ports with fixed read/write, this switched architecture allows a group of write ports to switch with another group of read ports dynamically, hence altering the number of active ports. The proposed switched-ports architecture is less flexible than a true-multiported RAM where each port is switched individually. Nevertheless, switched memories can dramatically reduce BRAM consumption compared to true ports for systems with alternating port requirements. Previous live-value-table (LVT) and XOR approaches are merged and optimized into a generalized and modular structure that we call an invalidation-based live-value-table (I-LVT). 
Like a regular LVT, the I-LVT determines the correct bank to read from, but it differs in how updates to the table are made; the LVT approach requires multiple write ports, often leading to an area-intensive register-based implementation, whereas the XOR approach suffers from excessive storage overhead since wider memories are required to accommodate the XOR-ed data. Two specific I-LVT implementations are proposed and evaluated: binary and thermometer coding. The I-LVT approach is especially suitable for deep memories because the table is implemented only in SRAM cells. The I-LVT method gives higher performance while occupying fewer BRAMs than earlier approaches: for several configurations, BRAM usage is reduced by more than 44% and clock speed is improved by more than 76%. The I-LVT can be used with fixed-port, true-port, or the proposed switched-port architectures. Formal proofs for the suggested methods, resource consumption analysis, usage guidelines, and an analytic comparison to other methods are provided. A fully parameterized Verilog implementation is released as an open-source library. The library has been extensively tested using Altera's EDA tools.
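
The two earlier approaches that the abstract contrasts can be illustrated with a small behavioral model. The sketch below is an assumption-laden illustration, not the authors' I-LVT and not their Verilog library: the class names `LVTMemory` and `XORMemory` and their methods are hypothetical, writes are applied one at a time rather than in the same clock cycle, and read-port replication of the banks is omitted.

```python
class LVTMemory:
    """Behavioral sketch of an LVT-style multi-write memory (hypothetical names)."""

    def __init__(self, depth, n_write_ports):
        # One data bank per write port.
        self.banks = [[0] * depth for _ in range(n_write_ports)]
        # Live-value table: for each address, the bank holding the newest value.
        self.lvt = [0] * depth

    def write(self, port, addr, value):
        self.banks[port][addr] = value
        self.lvt[addr] = port  # the table itself needs one write per write port

    def read(self, addr):
        # The table selects the bank to read from.
        return self.banks[self.lvt[addr]][addr]


class XORMemory:
    """Behavioral sketch of an XOR-style multi-write memory (hypothetical names)."""

    def __init__(self, depth, n_write_ports):
        # One bank per write port; no table, values are encoded by XOR.
        self.banks = [[0] * depth for _ in range(n_write_ports)]

    def write(self, port, addr, value):
        # Store the value XOR-ed with the other banks' contents at this address.
        others = 0
        for i, bank in enumerate(self.banks):
            if i != port:
                others ^= bank[addr]
        self.banks[port][addr] = value ^ others

    def read(self, addr):
        # XOR-ing every bank at the address recovers the most recent value.
        result = 0
        for bank in self.banks:
            result ^= bank[addr]
        return result


if __name__ == "__main__":
    for mem in (LVTMemory(8, 2), XORMemory(8, 2)):
        mem.write(0, 3, 0xAA)
        mem.write(1, 3, 0x55)  # a later write from the other port wins
        print(type(mem).__name__, hex(mem.read(3)))
```

In this model the LVT variant pays for a table that every write port must update, while the XOR variant pays by reading the other banks on every write. These correspond loosely to the two costs the abstract attributes to the earlier approaches.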
Pages: 26