Edge-Based Wedge Sampling to Estimate Triangle Counts in Very Large Graphs

October 27, 2017 · Declared Dead · 🏛 Industrial Conference on Data Mining

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Duru Türkoğlu, Ata Turk
arXiv ID: 1710.09961
Category: cs.DS (Data Structures & Algorithms)
Citations: 21
Venue: Industrial Conference on Data Mining
Last Checked: 3 months ago

Abstract
The number of triangles in a graph is useful to deduce a plethora of important features of the network that the graph is modeling. However, finding the exact value of this number is computationally expensive. Hence, a number of approximation algorithms based on random sampling of edges or wedges (adjacent edge pairs) have been proposed for estimating this value. We argue that for large sparse graphs with power-law degree distribution, random edge sampling requires sampling a large number of edges before providing enough information for accurate estimation, and existing wedge sampling methods lead to biased samples, which in turn lead to less accurate estimations. In this paper, we propose a hybrid algorithm between edge and wedge sampling that addresses the deficiencies of both approaches. We start with uniform edge sampling and then extend each selected edge to form a wedge that is more informative for estimating the overall triangle count. The core estimate we make is the number of triangles each sampled edge in the first phase participates in. This approach provides accurate approximations with very small sampling ratios, outperforming the state of the art by up to 8 times in sample size while providing estimations with 95% confidence.
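
The abstract gives only the outline of the hybrid scheme, so the following is a minimal Python sketch of one way the two phases could fit together, not the authors' implementation. It assumes a simple undirected graph given as a list of distinct edges with no self-loops. The choice to extend each sampled edge from its lower-degree endpoint, and the names `build_adjacency` and `estimate_triangles`, are illustrative assumptions. For a sampled edge (u, v) with u the lower-degree endpoint, a uniformly chosen neighbor w of u (other than v) closes a triangle with probability t(e) / (deg(u) - 1), where t(e) is the number of triangles containing the edge, so scaling the closed-wedge indicator by deg(u) - 1 gives an unbiased estimate of t(e). Summing t(e) over all m edges counts every triangle three times, hence the final factor m / 3.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Build an adjacency-set map from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def estimate_triangles(edges, num_samples, seed=None):
    """Estimate the triangle count by edge sampling plus wedge extension.

    Illustrative sketch only: extending from the lower-degree endpoint
    is an assumption, not a detail stated in the abstract.
    """
    rng = random.Random(seed)
    adj = build_adjacency(edges)
    m = len(edges)
    total = 0.0
    for _ in range(num_samples):
        # Phase 1: pick an edge uniformly at random (with replacement).
        u, v = rng.choice(edges)
        # Make u the endpoint with the smaller degree.
        if len(adj[u]) > len(adj[v]):
            u, v = v, u
        # Phase 2: extend the edge to a wedge through a random neighbor of u.
        candidates = [w for w in adj[u] if w != v]
        if not candidates:
            continue  # u has degree 1, so the edge lies in no triangle
        w = rng.choice(candidates)
        if w in adj[v]:
            # Closed wedge: contributes deg(u) - 1 to the estimate of t(e).
            total += len(candidates)
    # Average per-edge estimate, scaled by m edges, divided by 3 edges per triangle.
    return m * total / (3 * num_samples)


# Usage: in the complete graph K5 every wedge is closed, so the estimate is exact.
k5_edges = list(combinations(range(5), 2))
print(estimate_triangles(k5_edges, num_samples=50, seed=1))
```

On sparse graphs the estimate varies from run to run, and the number of samples needed for a given confidence level is not something this sketch addresses.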
