Cut to Fit: Tailoring the Partitioning to the Computation
April 20, 2018 · Declared Dead · GRADES/NDA@SIGMOD/PODS
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Iacovos Kolokasis, Polyvios Pratikakis
arXiv ID
1804.07747
Category
cs.DC: Distributed Computing
Citations
1
Venue
GRADES/NDA@SIGMOD/PODS
Last Checked
3 months ago
Abstract
Social Graph Analytics applications are very often built using off-the-shelf analytics frameworks. These, however, are profiled and optimized for the general case and have to perform for all kinds of graphs. This paper investigates how knowledge of the application and the dataset can help optimize performance with minimal effort. We concentrate on the impact of partitioning strategies on the performance of computations on social graphs. We evaluate six graph partitioning algorithms on a set of six social graphs, using four standard graph algorithms and measuring a set of five partitioning metrics. We analyze the performance of each partitioning strategy with respect to (i) the properties of the graph dataset, (ii) each analytics computation, and (iii) the number of partitions. We discover that communication cost is the best predictor of performance for most, but not all, analytics computations. We also find that the best partitioning strategy for a particular kind of algorithm may not be the best for another, and that optimizing for the general case of all algorithms may not select the optimal partitioning strategy for a given graph algorithm. We conclude with insights on selecting the right data partitioning strategy, which has significant impact on the performance of large graph analytics computations; certainly enough to warrant optimization of the partitioning strategy to the computation and to the dataset.
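To make the idea of a partitioning metric concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's method: the abstract does not define its five metrics, so the definitions below (edge balance, a replica-count proxy for communication cost, and replication factor) are common choices for edge partitioning that we assume for the example. The toy hash partitioner and all function names are hypothetical.

```python
from collections import defaultdict


def hash_edge_partition(edges, num_parts):
    """Toy partitioner (assumption): place each edge by its source vertex id."""
    return {edge: edge[0] % num_parts for edge in edges}


def partition_metrics(assignment, num_parts):
    """Compute illustrative metrics for an edge-to-partition assignment.

    These definitions are assumptions made for this sketch; the paper's
    own metric definitions are not given in the abstract.
    """
    sizes = [0] * num_parts
    replicas = defaultdict(set)  # vertex -> set of partitions holding a copy
    for (src, dst), part in assignment.items():
        sizes[part] += 1
        replicas[src].add(part)
        replicas[dst].add(part)

    total_edges = sum(sizes)
    # Balance: largest partition relative to the ideal (even) share of edges.
    balance = max(sizes) / (total_edges / num_parts)
    # Communication-cost proxy: extra vertex copies that must be kept in sync.
    comm_cost = sum(len(parts) - 1 for parts in replicas.values())
    # Replication factor: average number of copies per vertex.
    replication = sum(len(parts) for parts in replicas.values()) / len(replicas)
    return {
        "balance": balance,
        "comm_cost": comm_cost,
        "replication_factor": replication,
    }


edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)]
metrics = partition_metrics(hash_edge_partition(edges, 2), 2)
print(metrics)
```

Running the same function over several candidate partitioners for one graph gives a simple way to compare them before committing to a strategy, which is the kind of per-dataset, per-computation tuning the abstract argues for.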
Similar Papers
In the same crypt: Distributed Computing
R.I.P. 👻 Ghosted
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains
Reproducing GW150914: the first observation of gravitational waves from a binary black hole merger
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems
Died the same way: 👻 Ghosted
Language Models are Few-Shot Learners
PyTorch: An Imperative Style, High-Performance Deep Learning Library
XGBoost: A Scalable Tree Boosting System