Optimal streaming and tracking distinct elements with high probability
April 05, 2018 Β· Declared Dead Β· π ACM-SIAM Symposium on Discrete Algorithms
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
JarosΕaw BΕasiok
arXiv ID
1804.01642
Category
cs.DS: Data Structures & Algorithms
Citations
47
Venue
ACM-SIAM Symposium on Discrete Algorithms
Last Checked
3 months ago
Abstract
The distinct elements problem is one of the fundamental problems in streaming algorithms --- given a stream of integers in the range $\{1,\ldots,n\}$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements in the input. After a long line of research an optimal solution for this problem with constant probability of success, using $\mathcal{O}(\frac{1}{\varepsilon^2}+\log n)$ bits of space, was given by Kane, Nelson and Woodruff in 2010. The standard approach used in order to achieve low failure probability $Ξ΄$ is to take the median of $\log Ξ΄^{-1}$ parallel repetitions of the original algorithm. We show that such a multiplicative space blow-up is unnecessary: we provide an optimal algorithm using $\mathcal{O}(\frac{\log Ξ΄^{-1}}{\varepsilon^2} + \log n)$ bits of space --- matching known lower bounds for this problem. That is, the $\logΞ΄^{-1}$ factor does not multiply the $\log n$ term. This settles completely the space complexity of the distinct elements problem with respect to all standard parameters. We consider also the \emph{strong tracking} (or \emph{continuous monitoring}) variant of the distinct elements problem, where we want an algorithm which provides an approximation of the number of distinct elements seen so far, at all times of the stream. We show that this variant can be solved using $\mathcal{O}(\frac{\log \log n + \log Ξ΄^{-1}}{\varepsilon^2} + \log n)$ bits of space, which we show to be optimal.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Relief-Based Feature Selection: Introduction and Review
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
π»
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
π»
Ghosted