Arc: An IR for Batch and Stream Programming

Cited by: 5
Authors
Kroll, Lars [1]
Segeljakt, Klas [1]
Carbone, Paris [2]
Schulte, Christian [1]
Haridi, Seif [1]
Affiliations
[1] KTH Royal Inst Technol, Stockholm, Sweden
[2] RISE SICS, Stockholm, Sweden
Source
PROCEEDINGS OF THE 17TH ACM SIGPLAN INTERNATIONAL SYMPOSIUM ON DATABASE PROGRAMMING LANGUAGES (DBPL '19) | 2019
Keywords
stream processing; intermediate representation; data analytics
DOI
10.1145/3315507.3330199
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In big data analytics, there are currently a large number of data programming models and their respective frontends, such as relational tables, graphs, tensors, and streams. This has led to a plethora of runtimes that typically focus on the efficient execution of just a single frontend. This fragmentation manifests itself today in highly complex pipelines that bundle multiple runtimes to support the necessary models. Hence, joint optimization and execution of such pipelines across these frontend-bound runtimes is infeasible. We propose Arc as the first unified Intermediate Representation (IR) for data analytics that incorporates stream semantics based on a modern specification of streams, windows, and stream aggregation, to combine batch and stream computation models. Arc extends Weld, an IR for batch computation, and adds support for partitioned, out-of-order stream and window operators, which are the most fundamental building blocks in contemporary data streaming.
Pages: 53-58
Number of pages: 6
Related papers
15 in total
  • [1] Cyclone: Unified Stream and Batch Processing
    Harvan, Matus
    Locher, Thomas
    Sima, Ana Claudia
    PROCEEDINGS OF 45TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPPW 2016), 2016, : 220 - 229
  • [2] On Atomic Batch Executions in Stream Processing
    Vidyasankar, K.
    7TH INTERNATIONAL CONFERENCE ON EMERGING UBIQUITOUS SYSTEMS AND PERVASIVE NETWORKS (EUSPN 2016)/THE 6TH INTERNATIONAL CONFERENCE ON CURRENT AND FUTURE TRENDS OF INFORMATION AND COMMUNICATION TECHNOLOGIES IN HEALTHCARE (ICTH-2016), 2016, 98 : 72 - 79
  • [3] RStream: Simple and Efficient Batch and Stream Processing at Scale
    Fino, Alessio
    Margara, Alessandro
    Cugola, Gianpaolo
    Donadoni, Marco
    Morassutto, Edoardo
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 2764 - 2774
  • [4] From Batch to Stream: Automatic Generation of Online Algorithms
    Wang, Ziteng
    Pailoor, Shankara
    Prakash, Aaryan
    Wang, Yuepeng
    Dillig, Isil
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2024, 8 (PLDI):
  • [5] An Open-Source Framework Unifying Stream and Batch Processing
    Deshpande, Kiran
    Rao, Madhuri
    INVENTIVE COMPUTATION AND INFORMATION TECHNOLOGIES, ICICIT 2021, 2022, 336 : 607 - 630
  • [6] SPOC: GPGPU PROGRAMMING THROUGH STREAM PROCESSING WITH OCAML
    Bourgoin, Mathias
    Chailloux, Emmanuel
    Lamotte, Jean-Luc
    PARALLEL PROCESSING LETTERS, 2012, 22 (02)
  • [7] A hybrid distributed batch-stream processing approach for anomaly detection
    Pishgoo, Boshra
    Azirani, Ahmad Akbari
    Raahemi, Bijan
    INFORMATION SCIENCES, 2021, 543 : 309 - 327
  • [8] ab-Stream: A Framework for programming Many-core
    Gan, Xinbiao
    Wang, Zhiying
    Shen, Li
    Zhu, Qi
    PRZEGLAD ELEKTROTECHNICZNY, 2012, 88 (7B) : 341 - 344
  • [9] StreamFlex: High-throughput Stream Programming in Java
    Spring, Jesper H.
    Privat, Jean
    Guerraoui, Rachid
    Vitek, Jan
    OOPSLA: 22ND INTERNATIONAL CONFERENCE ON OBJECT-ORIENTED PROGRAMMING, SYSTEMS, LANGUAGES, AND APPLICATIONS, PROCEEDINGS, 2007, : 211 - 228
  • [10] StreamFlex: High-throughput stream programming in Java
    Spring, Jesper H.
    Privat, Jean
    Guerraoui, Rachid
    Vitek, Jan
    ACM SIGPLAN NOTICES, 2007, 42 (10) : 211 - 228