REEF: Retainable Evaluator Execution Framework

Cited by: 4
Authors
Chun, Byung-Gon [1]
Douglas, Chris [1]
Narayanamurthy, Shravan [1]
Rosen, Josh [1]
Condie, Tyson [1]
Matusevych, Sergiy [1]
Ramakrishnan, Raghu [1]
Sears, Russell [1]
Curino, Carlo [1]
Myers, Brandon [1]
Rao, Sriram [1]
Weimer, Markus [1]
Affiliations
[1] Microsoft Cloud and Information Services Lab, Redmond, WA, USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2013 / Volume 6 / Issue 12
Keywords
Computation theory - Computer software - Iterative methods - Computational methods - Semantics - MapReduce - Managers - Reefs - Artificial intelligence - Information management - Learning algorithms - Learning systems
DOI
10.14778/2536274.2536318
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this demo proposal, we describe REEF, a framework that makes it easy to implement scalable, fault-tolerant runtime environments for a range of computational models. We will demonstrate diverse workloads, including extract-transform-load MapReduce jobs, iterative machine learning algorithms, and ad-hoc declarative query processing. At its core, REEF builds atop YARN (Apache Hadoop 2's resource manager) to provide retainable hardware resources with lifetimes that are decoupled from those of computational tasks. This allows us to build persistent (cross-job) caches and cluster-wide services, but, more importantly, supports high-performance iterative graph processing and machine learning algorithms. Unlike existing systems, REEF aims for composability of jobs across computational models, providing significant performance and usability gains, even with legacy code. REEF includes a library of interoperable data management primitives optimized for communication and data movement (which are distinct from storage locality). The library also allows REEF applications to access external services, such as user-facing relational databases. We were careful to decouple lower levels of REEF from the data models and semantics of systems built atop it. The result was two new standalone systems: Tang, a configuration manager and dependency injector, and Wake, a state-of-the-art event-driven programming and data movement framework. Both are language independent, allowing REEF to bridge the JVM and .NET.
Pages: 1370-1373
Page count: 4