Reducing Static Energy in Supercomputer Interconnection Networks Using Topology-Aware Partitioning

Cited by: 6
Authors
Chen, Juan [1]
Tang, Yuhua [1]
Dong, Yong [1]
Xue, Jingling [2]
Wang, Zhiyuan [1]
Zhou, Wenhao [1]
Affiliations
[1] Natl Univ Def Technol, Sch Comp, State Key Lab High Performance Comp, Changsha 410073, Hunan, Peoples R China
[2] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia
Funding
Australian Research Council; National Natural Science Foundation of China
Keywords
Supercomputer interconnection networks; topology-aware partitioning; static energy management; switching off unused routers; Tianhe-2; POWER; CHIP;
DOI
10.1109/TC.2015.2493523
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The key to reducing static energy in supercomputers is switching off their unused components. Routers are the major components of a supercomputer. Whether routers can be effectively switched off or not has become the key to static energy management for supercomputers. For many typical applications, the routers in a supercomputer exhibit low utilization. However, there is no effective method to switch the routers off when they are idle. By analyzing the router occupancy in time and space, for the first time, we present a routing-policy guided topology partitioning methodology to solve this problem. We propose topology partitioning methods for three kinds of commonly used topologies (mesh, torus and fat-tree) equipped with the three most popular routing policies (deterministic routing, directionally adaptive routing and fully adaptive routing). Based on the above methods, we propose the key techniques required in this topology partitioning based static energy management in supercomputer interconnection networks to switch off unused routers in both time and space dimensions. Three topology-aware resource allocation algorithms have been developed to handle effectively different job-mixes running on a supercomputer. We validate the effectiveness of our methodology by using Tianhe-2 and a simulator for the aforementioned topologies and routing policies. The energy savings achieved on a subsystem of Tianhe-2 range from 3.8 to 79.7 percent. This translates into a yearly energy cost reduction of up to half a million US dollars for Tianhe-2.
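The idea of routing-policy guided partitioning can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a 2D mesh with deterministic dimension-order (XY) routing, and the function names `xy_route`, `routers_in_use` and `idle_routers` are hypothetical. Under that assumption, a job placed on a rectangular sub-mesh only ever routes through routers inside the rectangle, so every router outside it is a candidate for switching off, whereas a scattered placement of the same size keeps many more routers busy.

```python
from itertools import product


def xy_route(src, dst):
    """Routers visited by XY (dimension-order) routing on a 2D mesh.

    The packet first travels along X, then along Y. Endpoints are included.
    """
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


def routers_in_use(job_nodes):
    """All routers that any pair of job nodes may route through."""
    used = set()
    for src, dst in product(job_nodes, repeat=2):
        used.update(xy_route(src, dst))
    return used


def idle_routers(width, height, job_nodes):
    """Routers of a width x height mesh that the job never touches."""
    all_routers = set(product(range(width), range(height)))
    return all_routers - routers_in_use(job_nodes)


# Same job size (4 nodes) on a 4x4 mesh, two different placements.
rectangle = [(x, y) for x in range(2) for y in range(2)]
corners = [(0, 0), (3, 0), (0, 3), (3, 3)]

print(len(idle_routers(4, 4, rectangle)))  # 12 routers can be switched off
print(len(idle_routers(4, 4, corners)))    # 4 routers can be switched off
```

In this toy case the compact placement leaves 12 of 16 routers idle, while the scattered one leaves only 4, which is why the allocation has to be aware of both the topology and the routing policy.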
Pages: 2588-2602
Page count: 15