Fine-Grained DVFS Using On-Chip Regulators

Cited by: 88
Authors
Eyerman, Stijn [1]
Eeckhout, Lieven [1]
Affiliations
[1] Univ Ghent, ELIS Dept, B-9000 Ghent, Belgium
Keywords
Design; Performance; Experimentation; Energy-efficiency; on-chip voltage regulators; fine-grained DVFS; POWER; CORE; PERFORMANCE; FREQUENCY; CONVERTER; ENERGY;
DOI
10.1145/1952998.1952999
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Limit studies on Dynamic Voltage and Frequency Scaling (DVFS) provide apparently contradictory conclusions. On the one hand, early limit studies report that DVFS is effective at large timescales (on the order of million(s) of cycles) with large scaling overheads (on the order of tens of microseconds), and they conclude that there is no need for small-overhead DVFS at small timescales. Recent work, on the other hand, motivated by the surge of on-chip voltage regulator research, explores the potential of fine-grained DVFS and reports substantial energy savings at timescales of hundreds of cycles (while assuming no scaling overhead). This article unifies these apparently contradictory conclusions through a DVFS limit study that simultaneously explores timescale and scaling speed. We find that coarse-grained DVFS is unaffected by timescale and scaling speed; however, fine-grained DVFS may lead to substantial energy savings for memory-intensive workloads. Inspired by these insights, we subsequently propose a fine-grained microarchitecture-driven DVFS mechanism that scales down voltage and frequency upon individual off-chip memory accesses using on-chip regulators. Fine-grained DVFS reduces energy consumption by 12% on average and up to 23% over a collection of memory-intensive workloads for an aggressively clock-gated processor, while incurring an average 0.08% performance degradation (and at most 0.14%). We also demonstrate that the proposed fine-grained DVFS mechanism is orthogonal to existing coarse-grained DVFS policies, and further reduces energy by 6% on average and up to 11% for memory-intensive applications with limited performance impact (at most 0.7%).
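The trade-off between scaling speed and energy savings described in the abstract can be illustrated with a toy first-order model. This is only a sketch under stated assumptions and is not the model or methodology used in the article: power is taken as a dynamic term (C·V²·f) plus a leakage term (V·I_leak), the off-chip memory stall has a fixed wall-clock duration, and the core is assumed to stay at full power while the regulator ramps down and back up. All function names, parameters, and numeric values are hypothetical.

```python
def power(v, f_ghz, c_eff=1.0, i_leak=0.1):
    """Toy power model in arbitrary units: dynamic C*V^2*f plus leakage V*I_leak.

    c_eff and i_leak are hypothetical constants chosen for illustration.
    """
    return c_eff * v * v * f_ghz + v * i_leak


def stall_energy_saving(stall_ns, v_hi, f_hi, v_lo, f_lo, transition_ns=0.0):
    """Fraction of energy saved during one memory stall by scaling V and f down.

    Assumption: the core draws high-level power during both the ramp-down
    and the ramp-up (2 * transition_ns in total). If the stall is too short
    to fit both ramps, no scaling is attempted and the saving is zero.
    """
    ramp_ns = 2.0 * transition_ns
    if stall_ns <= ramp_ns:
        return 0.0
    p_hi = power(v_hi, f_hi)
    p_lo = power(v_lo, f_lo)
    e_base = p_hi * stall_ns
    e_scaled = p_hi * ramp_ns + p_lo * (stall_ns - ramp_ns)
    return 1.0 - e_scaled / e_base


if __name__ == "__main__":
    # Hypothetical operating points: 1.0 V at 2.0 GHz versus 0.7 V at 1.0 GHz,
    # with a 100 ns stall and a range of regulator transition times.
    for t_ns in (0.0, 10.0, 25.0, 50.0):
        s = stall_energy_saving(100.0, 1.0, 2.0, 0.7, 1.0, transition_ns=t_ns)
        print(f"transition {t_ns:5.1f} ns -> saving during stall {s:.3f}")
```

In this sketch the saving shrinks as the transition time grows and vanishes once the two ramps take as long as the stall itself, which matches the abstract's point that fine-grained DVFS depends on fast scaling. The model covers only energy during a single stall, so it says nothing about whole-program energy or performance impact.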
Pages: 24