Merging VLIW and vector processing techniques for a simple, high-performance processor architecture

Cited by: 0
Authors
Soliman, Mostafa I. [1, 2]
Affiliations
[1] Taibah Univ, Comp Sci & Informat Dept, Community Coll, Al Madinah Al Munawwarah 2898, Saudi Arabia
[2] Aswan Univ, Dept Elect Engn, Comp & Syst Sect, Fac Engn, Aswan 81542, Egypt
Keywords
Data-parallel applications; VLIW; Vector processing; VHDL; Performance evaluation; SUPERSCALAR; LEVEL; CORE
DOI
10.1016/j.mejo.2015.03.012
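Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract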
This paper proposes a new processor architecture called VVSHP for accelerating data-parallel applications, which are growing in importance and demand increased performance from hardware. VVSHP merges VLIW and vector processing techniques to form a simple, high-performance processor architecture. One key point of VVSHP is that the multiple scalar instructions within a VLIW and the vector instructions execute on unified parallel execution datapaths. Another key point is reducing the complexity of VVSHP by designing a two-part register file: (1) a shared scalar-vector part with eight read ports and four write ports, holding 64 x 32-bit registers (64 scalar or 16 x 4 vector registers) for storing scalar/vector data, and (2) a vector part with two read ports and one write port, holding 48 vector registers, each storing 4 x 32-bit vector data. Moreover, processing vector data with lengths varying from 1 to 256 is a key point for reducing loop overheads. VVSHP can issue up to four scalar/vector operations per cycle, processing a set of operands in parallel and producing up to four results to be written back into the VVSHP register file. However, it cannot issue more than one memory operation at a time; each memory operation loads/stores 128-bit scalar/vector data from/to data memory. The proposed VVSHP processor is implemented in VHDL targeting the Xilinx Virtex-5 FPGA, and its performance is evaluated. (C) 2015 Elsevier Ltd. All rights reserved.
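The figures quoted in the abstract can be turned into some back-of-the-envelope arithmetic. The sketch below is not the paper's performance model. It assumes (our assumption) that each issued operation produces one 32-bit result, so a vector of length `vlen` needs at least `ceil(vlen / 4)` issue groups, and that each 128-bit memory operation moves four 32-bit elements. The function names `groups_for_vector` and `register_file_words` are hypothetical, and the counts are lower bounds that ignore pipeline latency, dependencies, and alignment.

```python
import math

# Constants taken from the abstract.
LANES = 4          # up to four scalar/vector operations issued per cycle
ELEM_BITS = 32     # register element width in bits
MEM_OP_BITS = 128  # one memory operation loads/stores 128 bits
MAX_VLEN = 256     # vector lengths range from 1 to 256


def groups_for_vector(vlen: int) -> dict:
    """Lower-bound counts for one vector operation of length vlen.

    Assumption (ours, not the paper's): one issued operation yields
    one 32-bit result, and memory accesses are perfectly packed.
    """
    if not 1 <= vlen <= MAX_VLEN:
        raise ValueError(f"vector length must be in 1..{MAX_VLEN}")
    elems_per_mem_op = MEM_OP_BITS // ELEM_BITS  # 4 elements per access
    return {
        "issue_groups": math.ceil(vlen / LANES),
        "memory_ops": math.ceil(vlen / elems_per_mem_op),
    }


def register_file_words() -> dict:
    """Capacity of the two-part register file in 32-bit words."""
    shared = 64           # shared scalar-vector part: 64 x 32-bit registers
    vector_part = 48 * 4  # vector part: 48 registers of 4 x 32-bit each
    return {"shared": shared, "vector": vector_part,
            "total": shared + vector_part}


if __name__ == "__main__":
    for n in (1, 5, 256):
        print(n, groups_for_vector(n))
    print(register_file_words())
```

Under these assumptions, a maximum-length vector of 256 elements needs at least 64 issue groups and 64 memory operations, and because only one memory operation can issue at a time, the memory accesses alone occupy at least 64 issue slots.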
Pages: 637-655
Page count: 19