Beating CountSketch for Heavy Hitters in Insertion Streams

November 02, 2015 Β· Declared Dead Β· πŸ› Symposium on the Theory of Computing

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Vladimir Braverman, Stephen R. Chestnut, Nikita Ivkin, David P. Woodruff arXiv ID 1511.00661 Category cs.DS: Data Structures & Algorithms Citations 48 Venue Symposium on the Theory of Computing Last Checked 3 months ago
Abstract
Given a stream $p_1, \ldots, p_m$ of items from a universe $\mathcal{U}$, which, without loss of generality we identify with the set of integers $\{1, 2, \ldots, n\}$, we consider the problem of returning all $\ell_2$-heavy hitters, i.e., those items $j$ for which $f_j \geq Ρ\sqrt{F_2}$, where $f_j$ is the number of occurrences of item $j$ in the stream, and $F_2 = \sum_{i \in [n]} f_i^2$. Such a guarantee is considerably stronger than the $\ell_1$-guarantee, which finds those $j$ for which $f_j \geq Ρm$. In 2002, Charikar, Chen, and Farach-Colton suggested the {\sf CountSketch} data structure, which finds all such $j$ using $Θ(\log^2 n)$ bits of space (for constant $Ρ> 0$). The only known lower bound is $Ω(\log n)$ bits of space, which comes from the need to specify the identities of the items found. In this paper we show it is possible to achieve $O(\log n \log \log n)$ bits of space for this problem. Our techniques, based on Gaussian processes, lead to a number of other new results for data streams, including (1) The first algorithm for estimating $F_2$ simultaneously at all points in a stream using only $O(\log n\log\log n)$ bits of space, improving a natural union bound and the algorithm of Huang, Tai, and Yi (2014). (2) A way to estimate the $\ell_{\infty}$ norm of a stream up to additive error $Ρ\sqrt{F_2}$ with $O(\log n\log\log n)$ bits of space, resolving Open Question 3 from the IITK 2006 list for insertion only streams.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Data Structures & Algorithms

Died the same way β€” πŸ‘» Ghosted