Perfect $L_p$ Sampling in a Data Stream

August 16, 2018 Β· Declared Dead Β· πŸ› IEEE Annual Symposium on Foundations of Computer Science

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Rajesh Jayaram, David P. Woodruff arXiv ID 1808.05497 Category cs.DS: Data Structures & Algorithms Citations 45 Venue IEEE Annual Symposium on Foundations of Computer Science Last Checked 3 months ago
Abstract
In this paper, we resolve the one-pass space complexity of $L_p$ sampling for $p \in (0,2)$. Given a stream of updates (insertions and deletions) to the coordinates of an underlying vector $f \in \mathbb{R}^n$, a perfect $L_p$ sampler must output an index $i$ with probability $|f_i|^p/\|f\|_p^p$, and is allowed to fail with some probability $δ$. So far, for $p > 0$ no algorithm has been shown to solve the problem exactly using $\text{poly}( \log n)$-bits of space. In 2010, Monemizadeh and Woodruff introduced an approximate $L_p$ sampler, which outputs $i$ with probability $(1 \pm ν)|f_i|^p /\|f\|_p^p$, using space polynomial in $ν^{-1}$ and $\log(n)$. The space complexity was later reduced by Jowhari, Sağlam, and Tardos to roughly $O(ν^{-p} \log^2 n \log δ^{-1})$ for $p \in (0,2)$, which tightly matches the $Ω(\log^2 n \log δ^{-1})$ lower bound in terms of $n$ and $δ$, but is loose in terms of $ν$. Given these nearly tight bounds, it is perhaps surprising that no lower bound exists in terms of $ν$---not even a bound of $Ω(ν^{-1})$ is known. In this paper, we explain this phenomenon by demonstrating the existence of an $O(\log^2 n \log δ^{-1})$-bit perfect $L_p$ sampler for $p \in (0,2)$. This shows that $ν$ need not factor into the space of an $L_p$ sampler, which closes the complexity of the problem for this range of $p$. For $p=2$, our bound is $O(\log^3 n \log δ^{-1})$-bits, which matches the prior best known upper bound in terms of $n,δ$, but has no dependence on $ν$. For $p<2$, our bound holds in the random oracle model, matching the lower bounds in that model. Moreover, we show that our algorithm can be derandomized with only a $O((\log \log n)^2)$ blow-up in the space (and no blow-up for $p=2$). Our derandomization technique is general, and can be used to derandomize a large class of linear sketches.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Data Structures & Algorithms

Died the same way β€” πŸ‘» Ghosted