funcX: A Federated Function Serving Fabric for Science

Cited by: 110
Authors
Chard, Ryan [1]
Babuji, Yadu [2]
Li, Zhuozhao [2]
Skluzacek, Tyler [2]
Woodard, Anna [2]
Blaiszik, Ben [2]
Foster, Ian [1, 2]
Chard, Kyle [1, 2]
Affiliations
[1] Argonne Natl Lab, 9700 S Cass Ave, Argonne, IL 60439 USA
[2] Univ Chicago, Chicago, IL 60637 USA
Source
PROCEEDINGS OF THE 29TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, HPDC 2020 | 2020
Keywords
TAXONOMY; SYSTEMS
DOI
10.1145/3369583.3392683
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences. These new approaches must enable computation to be mobile, so that, for example, it can occur near data, be triggered by events (e.g., arrival of new data), be offloaded to specialized accelerators, or run remotely where resources are available. They also require new design approaches in which monolithic applications can be decomposed into smaller components that may in turn be executed separately and on the most suitable resources. To address these needs we present funcX, a distributed function as a service (FaaS) platform that enables flexible, scalable, and high-performance remote function execution. funcX's endpoint software can transform existing clouds, clusters, and supercomputers into function serving systems, while funcX's cloud-hosted service provides transparent, secure, and reliable function execution across a federated ecosystem of endpoints. We motivate the need for funcX with several scientific case studies, present our prototype design and implementation, show optimizations that deliver throughput in excess of 1 million functions per second, and demonstrate, via experiments on two supercomputers, that funcX can scale to more than 130,000 concurrent workers.
Pages: 65-76
Number of pages: 12