SociaLite: An Efficient Graph Query Language Based on Datalog

被引:19
|
作者
Seo, Jiwon [1 ]
Guo, Stephen [1 ]
Lam, Monica S. [1 ]
机构
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
基金
美国国家科学基金会;
关键词
Datalog; query languages; graph algorithms; social network analysis; AGGREGATION;
D O I
10.1109/TKDE.2015.2405562
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
With the rise of social networks, large-scale graph analysis becomes increasingly important. Because SQL lacks the expressiveness and performance needed for graph algorithms, lower-level, general-purpose languages are often used instead. For greater ease of use and efficiency, we propose SociaLite, a high-level graph query language based on Datalog. As a logic programming language, Datalog allows many graph algorithms to be expressed succinctly. However, its performance has not been competitive when compared to low-level languages. With SociaLite, users can provide high-level hints on the data layout and evaluation order; they can also define recursive aggregate functions which, as long as they are meet operations, can be evaluated incrementally and efficiently. Moreover, recursive aggregate functions make it possible to implement more graph algorithms that cannot be implemented in Datalog. We evaluated SociaLite by running nine graph algorithms in total; eight for social network analysis (shortest paths, PageRank, hubs and authorities, mutual neighbors, connected components, triangles, clustering coefficients, and betweenness centrality) and one for biological network analysis (Eulerian cycles). We use two real-life social graphs, LiveJournal and Last. fm, for the evaluation as well as one synthetic graph. The optimizations proposed in this paper speed up almost all the algorithms by 3 to 22 times. SociaLite even outperforms typical Java implementations by an average of 50 percent for the graph algorithms tested. When compared to highly optimized Java implementations, SociaLite programs are an order of magnitude more succinct and easier to write. Its performance is competitive, with only 16 percent overhead for the largest benchmark, and 25 percent overhead for the worst case benchmark. Most importantly, being a query language, SociaLite enables many more users who are not proficient in software engineering to perform network analysis easily and efficiently.
引用
收藏
页码:1824 / 1837
页数:14
相关论文
共 50 条
  • [41] Efficient provenance tracking for datalog using top-k queries
    Deutch, Daniel
    Gilad, Amir
    Moskovitch, Yuval
    VLDB JOURNAL, 2018, 27 (02) : 245 - 269
  • [42] Efficient provenance tracking for datalog using top-k queries
    Daniel Deutch
    Amir Gilad
    Yuval Moskovitch
    The VLDB Journal, 2018, 27 : 245 - 269
  • [43] Scalable aggregate keyword query over knowledge graph
    Hu, Xin
    Duan, Jiangli
    Dang, Depeng
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 107 (107): : 588 - 600
  • [44] Fact-checking Vietnamese Information Using Knowledge Graph, Datalog, and KG-BERT
    Duong, Huong T.
    Ho, Van H.
    Phuc Do
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (10)
  • [45] Incrementalizing Lattice-Based Program Analyses in Datalog
    Szabo, Tamas
    Bergmann, Gabor
    Erdweg, Sebastian
    Voelter, Markus
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2018, 2
  • [46] Incrementalizing Lattice-Based Program Analyses in Datalog
    Szabo, Tamas
    Bergmann, Gabor
    Erdweg, Sebastian
    Voelter, Markus
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2018, 2
  • [47] Formulog: Datalog for SMT-Based Static Analysis
    Bembenek, Aaron
    Greenberg, Michael
    Chong, Stephen
    PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2020, 4 (OOPSLA):
  • [48] Demonstration of LogicLib: An Expressive Multi-Language Interface over Scalable Datalog System
    Li, Mingda
    Wang, Jin
    Xiao, Guorui
    Li, Youfu
    Zaniolo, Carlo
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 4917 - 4920
  • [49] RQL: A Query Language for Rule Discovery in Databases
    Chardin, Brice
    Coquery, Emmanuel
    Pailloux, Marie
    Petit, Jean-Marc
    THEORETICAL COMPUTER SCIENCE, 2017, 658 : 357 - 374
  • [50] A structural/temporal query language for Business Processes
    Deutch, Daniel
    Milo, Tova
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2012, 78 (02) : 583 - 609