Compiler-assisted power optimization for clustered VLIW architectures

Cited by: 6
Authors
Nagpal, Rahul [1]
Srikant, Y. N. [1]
Affiliations
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India
Keywords
Scheduling; Clustered VLIW processors; Energy-aware scheduling; Leakage energy; Register file
DOI
10.1016/j.parco.2010.08.005
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to each register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler scheduling algorithms targeting two previously ignored power-hungry components in clustered VLIW architectures, namely the instruction decoder and the register file. We consider a split decoder design and propose a new energy-aware instruction scheduling algorithm that provides an average benefit of 14.5% and 17.3% in decoder power consumption over a purely hardware-based scheme for 2-clustered and 4-clustered VLIW machines, respectively. For register files, we propose two new scheduling algorithms that exploit limited register snooping capability to reduce extra register file accesses. Compared with a traditional greedy algorithm, the two proposed algorithms reduce register file power consumption on average by 6.85% and 11.90%, respectively, on a 2-clustered VLIW machine (10.39% and 17.78% on a 4-clustered machine), along with performance improvements of 4.81% and 5.34% (9.39% and 11.16%). © 2010 Elsevier B.V. All rights reserved.
Pages: 42-59
Number of pages: 18