An Effective Gray-Box Identification Procedure for Multicore Thermal Modeling

Cited by: 40
Authors
Beneventi, Francesco [1]
Bartolini, Andrea [1]
Tilli, Andrea [1]
Benini, Luca [1]
Affiliations
[1] Univ Bologna, Dept Elect Comp Sci & Syst, I-40136 Bologna, Italy
Keywords
Thermal control; thermal model; multicore; power model; gray box; system identification
DOI
10.1109/TC.2012.293
CLC number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Aggressive thermal management is a critical feature for high-end computing platforms, as worst-case thermal budgeting is becoming unaffordable. Reactive thermal management, which sets temperature thresholds to trigger thermal capping actions, is too "near-sighted," and it may lead to severe performance degradation and thermal overshoots. More aggressive proactive thermal management techniques minimize the performance penalty with smooth optimal control. These techniques require knowledge of thermal models, which must be accurate enough to make the control effective yet simple enough to keep its complexity limited. In practice, these models are not provided by manufacturers, and in most cases they depend strongly on the deployment environment. Hence, procedures to automatically derive thermal models in the field are needed. In this paper, we propose a gray-box procedure to learn a compact and physically consistent model for multicore chips. We leverage the physical consistency of the proposed model to tame the model complexity and to cope with large quantization noise in measurements. We exploit Output Error structures along with Levenberg-Marquardt and Least Squares optimization algorithms. We tackle the problem in a real-life context: we developed a complete infrastructure for model building and thermal data collection in the Linux environment, and we tested it on an Intel Nehalem-based server CPU.
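The output-error idea named in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure or model: it assumes a hypothetical single-node, first-order thermal model `T[k+1] = a*T[k] + b*P[k]` (temperature rise over ambient, driven by power), synthetic data quantized to whole degrees, and a hand-written Levenberg-Marquardt loop with a finite-difference Jacobian. All names and numeric values (`simulate`, `levenberg_marquardt`, `A_TRUE`, `B_TRUE`, the power profile) are illustrative assumptions.

```python
# Illustrative sketch only: a toy first-order model, not the multicore model of the paper.

def simulate(theta, power, t0=0.0):
    """Simulate T[k+1] = a*T[k] + b*P[k] from the input alone (output-error form)."""
    a, b = theta
    temps, t = [], t0
    for p in power:
        temps.append(t)
        t = a * t + b * p
    return temps

def residuals(theta, power, measured):
    """Measured output minus simulated output."""
    return [m - s for m, s in zip(measured, simulate(theta, power))]

def cost(theta, power, measured):
    return sum(r * r for r in residuals(theta, power, measured))

def levenberg_marquardt(theta, power, measured, iters=100, lam=1e-2, eps=1e-6):
    """Minimize the squared output error over (a, b) with damped Gauss-Newton steps."""
    theta = list(theta)
    for _ in range(iters):
        r = residuals(theta, power, measured)
        # Finite-difference Jacobian of the residuals with respect to a and b.
        cols = []
        for i in range(2):
            bumped = list(theta)
            bumped[i] += eps
            rb = residuals(bumped, power, measured)
            cols.append([(x - y) / eps for x, y in zip(rb, r)])
        h00 = sum(c * c for c in cols[0])
        h11 = sum(c * c for c in cols[1])
        h01 = sum(c0 * c1 for c0, c1 in zip(cols[0], cols[1]))
        g0 = sum(c * x for c, x in zip(cols[0], r))
        g1 = sum(c * x for c, x in zip(cols[1], r))
        # Solve (J^T J + lam * diag(J^T J)) d = -J^T r for the 2x2 case.
        a00, a11 = h00 * (1.0 + lam), h11 * (1.0 + lam)
        det = a00 * a11 - h01 * h01
        d0 = (-g0 * a11 + g1 * h01) / det
        d1 = (-g1 * a00 + g0 * h01) / det
        trial = [theta[0] + d0, theta[1] + d1]
        if cost(trial, power, measured) < cost(theta, power, measured):
            theta, lam = trial, lam * 0.5   # accept the step, trust the model more
        else:
            lam *= 4.0                      # reject the step, increase damping
    return theta

# Synthetic experiment (all values are assumptions chosen for illustration).
A_TRUE, B_TRUE = 0.95, 0.021
levels = [10.0, 60.0, 30.0, 50.0, 20.0, 40.0]            # power steps in watts
power = [p for _ in range(2) for p in levels for _ in range(50)]
# Whole-degree quantization stands in for coarse sensor resolution.
measured = [float(round(t)) for t in simulate((A_TRUE, B_TRUE), power)]

a_hat, b_hat = levenberg_marquardt([0.8, 0.05], power, measured)
print("estimated a, b:", a_hat, b_hat)
print("estimated steady-state gain (degC/W):", b_hat / (1.0 - a_hat))
```

Because the residual is computed against a simulation driven only by the input, the quantized measurements never enter the regressors, which is the property that makes output-error structures attractive when measurement noise is large.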
Pages: 1097-1110
Page count: 14