Merging VLIW and vector processing techniques for a simple, high-performance processor architecture

Author
Soliman, Mostafa I. [1 ,2 ]
Affiliations
[1] Taibah Univ, Comp Sci & Informat Dept, Community Coll, Al Madinah Al Munawwarah 2898, Saudi Arabia
[2] Aswan Univ, Dept Elect Engn, Comp & Syst Sect, Fac Engn, Aswan 81542, Egypt
Keywords
Data-parallel applications; VLIW; Vector processing; VHDL; Performance evaluation; SUPERSCALAR; LEVEL; CORE;
DOI
10.1016/j.mejo.2015.03.012
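Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract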
This paper proposes a new processor architecture called VVSHP for accelerating data-parallel applications, which are growing in importance and demanding increased performance from hardware. VVSHP merges VLIW and vector processing techniques for a simple, high-performance processor architecture. One key point of VVSHP is the execution of multiple scalar instructions within VLIW and vector instructions on unified parallel execution datapaths. Another key point is reducing the complexity of VVSHP through a two-part register file: (1) a shared scalar-vector part with eight read and four write ports, holding 64 x 32-bit registers (64 scalar or 16 x 4 vector registers) for scalar/vector data, and (2) a vector part with two read ports and one write port, holding 48 vector registers that each store 4 x 32-bit vector data. Moreover, processing vector data with lengths varying from 1 to 256 is a key point for reducing loop overheads. VVSHP can issue up to four scalar/vector operations per cycle, processing a set of operands in parallel and producing up to four results to be written back into the VVSHP register file. However, it cannot issue more than one memory operation at a time; each memory operation loads or stores 128-bit scalar/vector data from or to data memory. The design of our proposed VVSHP processor is implemented in VHDL targeting the Xilinx Virtex-5 FPGA, and its performance is evaluated. (C) 2015 Elsevier Ltd. All rights reserved.
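The register-file arithmetic in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names, the contiguous mapping of scalar registers onto vector registers, and the idea of counting 4-element groups per vector are all assumptions made for illustration. Only the constants (64 shared 32-bit registers, 4 elements per vector register, vector lengths from 1 to 256) come from the abstract.

```python
# Illustrative sketch only; the mapping and helper names are hypothetical.
LANES = 4              # abstract: each vector register stores 4 x 32-bit elements
SHARED_SCALARS = 64    # abstract: shared part holds 64 x 32-bit registers
MAX_VLEN = 256         # abstract: vector lengths vary from 1 to 256


def shared_vector_view(scalar_index):
    """Return (vector_register, lane) for a shared scalar register.

    Assumes scalar registers 4k..4k+3 form vector register k; the
    abstract states only that 64 scalar registers equal 16 x 4 vector
    registers, not how they are numbered.
    """
    if not 0 <= scalar_index < SHARED_SCALARS:
        raise ValueError("scalar register index out of range")
    return divmod(scalar_index, LANES)


def element_groups(vlen):
    """Return how many 4-element groups cover a vector of length vlen."""
    if not 1 <= vlen <= MAX_VLEN:
        raise ValueError("vector length must be between 1 and 256")
    return -(-vlen // LANES)  # ceiling division


if __name__ == "__main__":
    print(shared_vector_view(63))
    print(element_groups(256))
```

Under these assumptions, the last shared scalar register falls in the sixteenth vector register, and a maximum-length vector spans 64 four-element groups.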
Pages: 637-655
Page count: 19