Flan: An Expressive and Efficient Datalog Compiler for Program Analysis

被引:1
作者
Abeysinghe, Supun [1 ]
Xhebraj, Anxhelo [1 ]
Rompf, Tiark [1 ]
机构
[1] Purdue Univ, Dept Comp Sci, 610 Purdue Mall, W Lafayette, IN 47907 USA
来源
PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL | 2024年 / 8卷 / POPL期
关键词
Datalog; Logic Programming; Generative Programming; Program Analysis; QUERY PLANS;
D O I
10.1145/3632928
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Datalog has gained prominence in program analysis due to its expressiveness and ease of use. Its generic fixpoint resolution algorithm over relational domains simplifies the expression of many complex analyses. The performance and scalability issues of early Datalog approaches have been addressed by tools such as Souffle through specialized code generation. Still, while pure Datalog is expressive enough to support a wide range of analyses, there is a growing need for extensions to accommodate increasingly complex analyses. This has led to the development of various extensions, such as Flix, Datafun, and Formulog, which enhance Datalog with features like arbitrary lattices and SMT constraints. Most of these extensions recognize the need for full interoperability between Datalog and a full-fledged programming language, a functionality that high-performance systems like Souffle lack. Speciflcally, in most cases, they construct languages from scratch with first-class Datalog support, allowing greater flexibility. However, this flexibility often comes at the cost of performance due to the conflicting requirements of prioritizing modularity and abstraction over efficiency. Consequently, achieving both flexibility and compilation to highly-performant specialized code poses a significant challenge. In this work, we reconcile the competing demands of expressiveness and performance with Flan, a Datalog compiler fully embedded in Scala that leverages multi-stage programming to generate specialized code for enhanced performance. Our approach combines the flexibility of Flix with Souffle's performance, offering seamless integration with the host language that enables the addition of powerful extensions while generating specialized code for the entire computation. Flan's simple operator interface allows the addition of an extensive set of features, including arbitrary aggregates, user-defined functions, and lattices, with multiple execution strategies such as binary and multi-way joins, supported by difierent indexing structures like specialized trees and hash tables, with minimal effort. We evaluate our system on a variety of benchmarks and compare it to established Datalog engines. Our results demonstrate competitive performance and speedups in the range of 1.4x to 12.5x compared to state-of-the-art systems for workloads of practical importance.
引用
收藏
页码:2577 / 2609
页数:33
相关论文
共 50 条
  • [1] Using Datalog and Boolean Equation Systems for Program Analysis
    Alpuente, Maria
    Feliu, Marco A.
    Joubert, Christophe
    Villanueva, Alicia
    FORMAL METHODS FOR INDUSTRIAL CRITICAL SYSTEMS, 2009, 5596 : 215 - 231
  • [2] The Complexity and Expressive Power of Limit Datalog
    Kaminski, Mark
    Kostylev, Egor, V
    Grau, Bernardo Cuenca
    Motik, Boris
    Horrocks, Ian
    JOURNAL OF THE ACM, 2022, 69 (01)
  • [3] The Expressive Power of Higher-Order Datalog
    Charalambidis, Angelos
    Nomikos, Christos
    Rondogiannis, Panos
    THEORY AND PRACTICE OF LOGIC PROGRAMMING, 2019, 19 (5-6) : 925 - 940
  • [4] On Fast Large-Scale Program Analysis in Datalog
    Scholz, Bernhard
    Jordan, Herbert
    Subotic, Pavle
    Westmann, Till
    PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION (CC 2016), 2016, : 196 - 206
  • [5] Incremental Whole-Program Analysis in Datalog with Lattices
    Szabo, Tamas
    Erdweg, Sebastian
    Bergmann, Gabor
    PROCEEDINGS OF THE 42ND ACM SIGPLAN INTERNATIONAL CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION (PLDI '21), 2021, : 1 - 15
  • [6] A Datalog Source-To-Source Translator for Static Program Analysis: An Experience Report
    Vorobyov, Bernhard Scholz Kostyantyn
    Krishnan, Padmanabhan
    Westmann, Till
    2015 24TH AUSTRALASIAN SOFTWARE ENGINEERING CONFERENCE (ASWEC 2015), 2015, : 28 - 37
  • [7] Isabelle-verified correctness of Datalog programs for program analysis
    Schlichtkrull, Anders
    Hansen, Rene Rydhof
    Nielson, Flemming
    39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024, 2024, : 1731 - 1732
  • [8] Program Repair Guided by Datalog-Defined Static Analysis
    Liu, Yu
    Mechtaev, Sergey
    Subotic, Pavle
    Roychoudhury, Abhik
    PROCEEDINGS OF THE 31ST ACM JOINT MEETING EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, ESEC/FSE 2023, 2023, : 1216 - 1228
  • [9] A Typed Multi-level Datalog IR and Its Compiler Framework
    Klopp, David
    Erdweg, Sebastian
    Pacak, Andre
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2024, 8 (OOPSLA2):
  • [10] DATALOG_SOLVE: A Datalog-Based Demand-Driven Program Analyzer
    Alpuente, M.
    Feliu, M. A.
    Joubert, C.
    Villanueva, A.
    ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2009, 248 : 57 - 66