Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Framework

Cited by: 26
Authors
Zhao, Yaxiong [1, 2]
Wu, Jie [2 ]
Liu, Cong [3 ]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
[2] Temple Univ, Philadelphia, PA 19122 USA
[3] Sun Yat Sen Univ, Guangzhou 510275, Guangdong, Peoples R China
Keywords
big-data; MapReduce; Hadoop; caching
DOI
10.1109/TST.2014.6733207
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The buzzword "big data" refers to large-scale distributed data processing applications that operate on exceptionally large amounts of data. Google's MapReduce and Apache Hadoop, its open-source implementation, are the de facto software systems for big-data applications. One observation about the MapReduce framework is that it generates a large amount of intermediate data. This abundant information is thrown away after the tasks finish, because MapReduce is unable to utilize it. In this paper, we propose Dache, a data-aware cache framework for big-data applications. In Dache, tasks submit their intermediate results to the cache manager, and a task queries the cache manager before executing the actual computing work. A novel cache description scheme and a cache request and reply protocol are designed. We implement Dache by extending Hadoop. Testbed experiment results demonstrate that Dache significantly improves the completion time of MapReduce jobs.
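The query-before-compute pattern described in the abstract can be illustrated with a minimal sketch. This is not Dache's actual interface or its Hadoop extension: the `CacheManager` class, the `run_task` helper, and the cache key made of an input identifier plus an operation name are all hypothetical names chosen for illustration, and the store is a plain in-memory dictionary rather than a distributed service.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical cache key: (identifier of the input split, name of the operation).
# The abstract mentions a cache description scheme but does not spell it out here.
CacheKey = Tuple[str, str]


class CacheManager:
    """Toy in-memory stand-in for a cache manager (assumed, not Dache's real API)."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, object] = {}
        self.hits = 0
        self.misses = 0

    def query(self, key: CacheKey) -> Tuple[bool, object]:
        """Return (True, result) when a cached result exists, else (False, None)."""
        if key in self._store:
            self.hits += 1
            return True, self._store[key]
        self.misses += 1
        return False, None

    def submit(self, key: CacheKey, result: object) -> None:
        """Record an intermediate result so that later tasks can reuse it."""
        self._store[key] = result


def run_task(manager: CacheManager, input_id: str, operation: str,
             compute: Callable[[List[str]], object], data: List[str]) -> object:
    """Query the cache first; compute and submit only on a miss."""
    key = (input_id, operation)
    found, cached = manager.query(key)
    if found:
        return cached
    result = compute(data)
    manager.submit(key, result)
    return result


def word_count(lines: List[str]) -> Dict[str, int]:
    """Example map-style computation: count the words in a list of lines."""
    counts: Dict[str, int] = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    manager = CacheManager()
    lines = ["a b a", "b c"]
    run_task(manager, "split-0", "word_count", word_count, lines)  # miss: computes
    run_task(manager, "split-0", "word_count", word_count, lines)  # hit: reuses
    print(manager.hits, manager.misses)
```

Running the same task twice on the same input computes the result once and serves the second call from the cache, which is the source of the completion-time savings the abstract reports.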
Pages: 39-50
Page count: 12