Scaling-Up In-Memory Datalog Processing: Observations and Techniques
December 10, 2018 ยท Declared Dead ยท ๐ Proceedings of the VLDB Endowment
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Zhiwei Fan, Jianqiao Zhu, Zuyu Zhang, Aws Albarghouthi, Paraschos Koutris, Jignesh Patel
arXiv ID
1812.03975
Category
cs.DB: Databases
Citations
31
Venue
Proceedings of the VLDB Endowment
Last Checked
3 months ago
Abstract
Recursive query processing has experienced a recent resurgence, as a result of its use in many modern application domains, including data integration, graph analytics, security, program analysis, networking and decision making. Due to the large volumes of data being processed, several research efforts, across multiple communities, have explored how to scale up recursive queries, typically expressed in Datalog. Our experience with these tools indicated that their performance does not translate across domains (e.g., a tool design for large-scale graph analytics does not exhibit the same performance on program-analysis tasks, and vice versa). As a result, we designed and implemented a general-purpose Datalog engine, called RecStep, on top of a parallel single-node relational system. In this paper, we outline the different techniques we use in RecStep, and the contribution of each technique to overall performance. We also present results from a detailed set of experiments comparing RecStep with a number of other Datalog systems using both graph analytics and program-analysis tasks, summarizing pros and cons of existing techniques based on the analysis of our observations. We show that RecStep generally outperforms the state-of-the-art parallel Datalog engines on complex and large-scale Datalog program evaluation, by a 4-6X margin. An additional insight from our work is that we show that it is possible to build a high-performance Datalog system on top of a relational engine, an idea that has been dismissed in past work in this area.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Databases
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
The Case for Learned Index Structures
R.I.P.
๐ป
Ghosted
Untangling Blockchain: A Data Processing View of Blockchain Systems
R.I.P.
๐ป
Ghosted
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades
R.I.P.
๐ป
Ghosted
BLOCKBENCH: A Framework for Analyzing Private Blockchains
R.I.P.
๐ป
Ghosted
Data Synthesis based on Generative Adversarial Networks
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted