A Unitable Computing Architecture for Chip Multiprocessors

Cited by: 2
Authors
Chiu, Jih-Ching [1]
Chou, Yu-Liang [1]
Chen, Po-Kai [1]
Su, Ding-Siang [1]
Affiliations
[1] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung, Taiwan
Keywords
reconfigurable hardwares; MIMD processors; dynamic multi-core; superscalar; CMP; chip multiprocessors
DOI
10.1093/comjnl/bxr085
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
This paper proposes a unitable multi-core architecture, called hyperscalar, that can dynamically unite many scalar cores into a larger superscalar processor to accelerate a thread. To accomplish this, the paper proposes virtual shared register files (VSRFs), which let the instructions of a thread running in different cores logically see a uniform set of register files. We also propose an instruction analyzer that detects dependence information and tags newly fetched instructions with it. With the tags, instructions in the united cores can issue requests to obtain their remote operands via the VSRF, so the dependences arising among instructions in different cores can be resolved. Moreover, some extended instructions are defined so that programmers can grow or shrink the number of united cores to match the available instruction-level parallelism of different applications. The reconfigurable feature of hyperscalar covers a spectrum of workloads well, providing high single-thread performance when thread-level parallelism (TLP) is low and high throughput when TLP is high. Simulation results show that the two-, four- and eight-core-united configurations of the eight-core hyperscalar chip multiprocessor achieve 93, 80 and 76% of the performance of monolithic two-, four- and eight-issue out-of-order superscalar processors, respectively, with lower area costs and better support for software diversity.
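The dependence tagging described in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's design: the record does not say how instructions are distributed among united cores, so round-robin assignment is assumed here, and the names `Instr`, `tag_dependences` and `remote_srcs` are hypothetical. Each source operand whose most recent producer sits in another core is tagged with that core's id, which is the information an instruction would need to request the operand through a shared register view such as the VSRF.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Instr:
    """Hypothetical instruction record: one destination, several sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...]
    core: int = -1
    # source register -> id of the remote core that produces it
    remote_srcs: Dict[str, int] = field(default_factory=dict)


def tag_dependences(instrs: List[Instr], num_cores: int) -> List[Instr]:
    """Assign instructions to cores and tag cross-core register dependences.

    Assumption: instructions are dealt to cores round-robin in program
    order. The source abstract does not specify the real policy.
    """
    last_writer: Dict[str, int] = {}  # register -> core of latest producer
    for i, ins in enumerate(instrs):
        ins.core = i % num_cores
        for reg in ins.srcs:
            producer = last_writer.get(reg)
            # Tag only operands produced in a different core.
            if producer is not None and producer != ins.core:
                ins.remote_srcs[reg] = producer
        if ins.dest is not None:
            last_writer[ins.dest] = ins.core
    return instrs


if __name__ == "__main__":
    program = [
        Instr("r1", ()),
        Instr("r2", ("r1",)),
        Instr("r3", ("r1", "r2")),
        Instr("r4", ("r3",)),
    ]
    for ins in tag_dependences(program, num_cores=2):
        print(ins.core, ins.dest, ins.remote_srcs)
```

With a single core no operand is tagged as remote; as more cores are united, more operands cross core boundaries, which is one intuitive reason the relative performance reported in the abstract falls from 93% to 76% as the united width grows.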
Pages: 2033-2052
Page count: 20