Power saving based on characteristics of machine learning in data center

Cited by: 0
Authors
Affiliations
[1] School of Computer Science, Fudan University
Source
Wang, Z.-G. (zgwang@fudan.edu.cn) | Year: 1600 / Volume: Chinese Academy of Sciences / Issue: 25
Keywords
Distributed computing; Machine learning; MapReduce; PageRank; Power saving
DOI
10.13328/j.cnki.jos.004601
Chinese Library Classification number
Subject classification code
Abstract
With the development of the Internet, the scale of data centers has increased dramatically, and how to analyze the data stored in them has become a hot research topic. Programmers resort to machine learning to analyze unstructured or semi-structured data, so energy-efficient machine learning is crucial for green data centers. Based on the observation that machine learning applications contain redundant computation, this paper proposes a system that saves power by removing the redundant computations and reusing previous computation results. Evaluation shows that for the typical k-means and PageRank applications, the presented algorithm achieves power savings of 23% and 17%, respectively. © 2014 ISCAS.
Pages: 1432-1447
Page count: 15