SSumM: Sparse Summarization of Massive Graphs
June 01, 2020 Β· Declared Dead Β· π Knowledge Discovery and Data Mining
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Kyuhan Lee, Hyeonsoo Jo, Jihoon Ko, Sungsu Lim, Kijung Shin
arXiv ID
2006.01060
Category
cs.DB: Databases
Cross-listed
cs.SI
Citations
47
Venue
Knowledge Discovery and Data Mining
Last Checked
4 months ago
Abstract
Given a graph G and the desired size k in bits, how can we summarize G within k bits, while minimizing the information loss? Large-scale graphs have become omnipresent, posing considerable computational challenges. Analyzing such large graphs can be fast and easy if they are compressed sufficiently to fit in main memory or even cache. Graph summarization, which yields a coarse-grained summary graph with merged nodes, stands out with several advantages among graph compression techniques. Thus, a number of algorithms have been developed for obtaining a concise summary graph with little information loss or equivalently small reconstruction error. However, the existing methods focus solely on reducing the number of nodes, and they often yield dense summary graphs, failing to achieve better compression rates. Moreover, due to their limited scalability, they can be applied only to moderate-size graphs. In this work, we propose SSumM, a scalable and effective graph-summarization algorithm that yields a sparse summary graph. SSumM not only merges nodes together but also sparsifies the summary graph, and the two strategies are carefully balanced based on the minimum description length principle. Compared with state-of-the-art competitors, SSumM is (a) Concise: yields up to 11.2X smaller summary graphs with similar reconstruction error, (b) Accurate: achieves up to 4.2X smaller reconstruction error with similarly concise outputs, and (c) Scalable: summarizes 26X larger graphs while exhibiting linear scalability. We validate these advantages through extensive experiments on 10 real-world graphs.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Databases
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Untangling Blockchain: A Data Processing View of Blockchain Systems
R.I.P.
π»
Ghosted
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades
R.I.P.
π»
Ghosted
BLOCKBENCH: A Framework for Analyzing Private Blockchains
R.I.P.
π»
Ghosted
Data Synthesis based on Generative Adversarial Networks
R.I.P.
π»
Ghosted
HoloClean: Holistic Data Repairs with Probabilistic Inference
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted