A synthesis methodology for hybrid custom instruction and coprocessor generation for extensible processors

Cited by: 14
Authors
Sun, Fei
Ravi, Srivaths
Raghunathan, Anand
Jha, Niraj K.
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
[2] NEC Labs Amer Inc, Princeton, NJ 08540 USA
Funding
National Science Foundation (USA)
Keywords
Application-specific instruction set processor; coprocessor; custom instruction; extensible processor
DOI
10.1109/TCAD.2007.906457
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Systems-on-chip often use hardware accelerators or coprocessors to provide efficient implementations of application-specific functions. The recent emergence of extensible processor cores with supporting design tools has given designers another viable alternative, namely, the use of application-specific custom instructions. Coprocessors and custom instructions can be viewed as two different forms of hardware acceleration that are applicable at different levels of granularity and offer differing tradeoffs. Classical hardware/software-partitioning techniques and application-specific instruction-set design tools address the individual problems of coprocessor generation and custom-instruction addition. However, given a complex application, it is not clear which design choice (coprocessors, custom instructions, or a combination) will result in better performance, area, or power consumption. We demonstrate that a combination of custom instructions and coprocessors is often the best solution, making the case for a hybrid custom-instruction and coprocessor-synthesis methodology. We propose such a methodology that builds upon the basic observations that coprocessors are usually good for coarse-grained tasks and require minimal intervention or support from the processor, while custom instructions are usually suited to fine-grained operations that are best integrated into a processor pipeline. Our methodology uses a hierarchical task-graph representation in order to support both coarse- and fine-grained views of an application, which are necessary to make meaningful tradeoffs. We propose a hierarchical synthesis algorithm that incorporates multiobjective evolutionary optimization in order to handle different design dimensions, such as area and performance, and provide a wide range of nondominated solutions. We have implemented the proposed methodology in the context of a commercial extensible processor-based platform (Xtensa from Tensilica). Our design flow uses a commercial behavioral-synthesis tool and an existing automatic custom-instruction generation tool. Our experiments with several applications show that simultaneous custom-instruction and coprocessor synthesis can achieve significantly better area/performance tradeoffs than using only one of them.
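The term "nondominated solutions" in the abstract can be illustrated with a minimal sketch. It is not the paper's synthesis algorithm: it shows only the final filtering step of a two-objective (area, cycle count) comparison. The candidate labels and all numbers are hypothetical placeholders chosen for illustration, not results from the paper.

```python
from typing import List, Tuple

# (label, area, cycles); both objectives are minimized.
# All values below are made-up placeholders, not data from the paper.
DesignPoint = Tuple[str, float, float]


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def nondominated(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates: List[DesignPoint] = [
    ("software only", 0.0, 1000.0),
    ("custom instructions only", 20.0, 700.0),
    ("coprocessor only", 60.0, 500.0),
    ("hybrid A", 50.0, 400.0),
    ("hybrid B", 80.0, 450.0),
]

front = nondominated(candidates)
for label, area, cycles in front:
    print(f"{label}: area={area}, cycles={cycles}")
```

With these placeholder values, "coprocessor only" and "hybrid B" are dropped because "hybrid A" is both smaller and faster than each of them. The remaining points each represent a different area/performance tradeoff that a designer could choose from.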
Pages: 2035-2045
Page count: 11