Online Lewis Weight Sampling
July 17, 2022 · Declared Dead · ACM-SIAM Symposium on Discrete Algorithms
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
David P. Woodruff, Taisuke Yasuda
arXiv ID
2207.08268
Category
cs.DS: Data Structures & Algorithms
Cross-listed
cs.LG, stat.ML
Citations
24
Venue
ACM-SIAM Symposium on Discrete Algorithms
Last Checked
3 months ago
Abstract
The seminal work of Cohen and Peng introduced Lewis weight sampling to the theoretical computer science community, yielding fast row sampling algorithms for approximating $d$-dimensional subspaces of $\ell_p$ up to $(1+ε)$ error. Several works have extended this important primitive to other settings, including the online coreset and sliding window models. However, these results are only for $p\in\{1,2\}$, and results for $p=1$ require a suboptimal $\tilde O(d^2/ε^2)$ samples. In this work, we design the first nearly optimal $\ell_p$ subspace embeddings for all $p\in(0,\infty)$ in the online coreset and sliding window models. In both models, our algorithms store $\tilde O(d^{1\lor(p/2)}/ε^2)$ rows. This answers a substantial generalization of the main open question of [BDMMUWZ2020], and gives the first results for all $p\notin\{1,2\}$. Towards our result, we give the first analysis of "one-shot" Lewis weight sampling, that is, sampling rows proportionally to their Lewis weights, with sample complexity $\tilde O(d^{p/2}/ε^2)$ for $p>2$. Previously, this scheme was only known to have sample complexity $\tilde O(d^{p/2}/ε^5)$, whereas $\tilde O(d^{p/2}/ε^2)$ is known if a more sophisticated recursive sampling is used. The recursive sampling cannot be implemented online, thus necessitating an analysis of one-shot Lewis weight sampling. Our analysis uses a novel connection to online numerical linear algebra. As an application, we obtain the first one-pass streaming coreset algorithms for $(1+ε)$ approximation of important generalized linear models, such as logistic regression and $p$-probit regression. Our upper bounds are parameterized by a complexity parameter $μ$ introduced by [MSSW2018], and we give the first lower bounds showing that a linear dependence on $μ$ is necessary.
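To make the "one-shot" scheme in the abstract concrete, here is a minimal offline sketch in Python with NumPy. It is not the paper's online or sliding-window algorithm. It assumes the standard fixed-point iteration for approximating Lewis weights, which is known to converge for $p<4$, and a full-column-rank input matrix. The function names `lewis_weights` and `one_shot_sample`, the iteration count, and the target sample size `m` are illustrative choices, not taken from the paper.

```python
import numpy as np


def lewis_weights(A, p, iters=30):
    """Approximate l_p Lewis weights of the rows of A.

    Assumption: 0 < p < 4 and A has full column rank, so the
    fixed-point iteration below converges.
    """
    n, d = A.shape
    w = np.full(n, d / n)  # uniform starting guess
    for _ in range(iters):
        # M = A^T W^{1 - 2/p} A, with W = diag(w)
        M = A.T @ (A * (w ** (1.0 - 2.0 / p))[:, None])
        # tau_i = a_i^T M^{-1} a_i for every row a_i
        tau = np.einsum("ij,ij->i", A @ np.linalg.inv(M), A)
        w = tau ** (p / 2.0)
    return w


def one_shot_sample(A, p, m, rng):
    """Keep row i independently with probability q_i proportional to its
    Lewis weight (capped at 1), and rescale kept rows by q_i^{-1/p}."""
    d = A.shape[1]
    w = lewis_weights(A, p)
    q = np.minimum(1.0, m * w / d)     # expected sample size is at most m
    keep = rng.random(A.shape[0]) < q  # one independent coin flip per row
    scale = q[keep] ** (-1.0 / p)      # makes ||SAx||_p^p unbiased for ||Ax||_p^p
    return A[keep] * scale[:, None], q


rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
S, q = one_shot_sample(A, p=1.0, m=50, rng=rng)
```

Each row is sampled in a single pass over the weights, with no recursive resampling, which is the property the abstract singles out as necessary for an online implementation.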
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
Similar Papers
In the same crypt: Data Structures & Algorithms
The Cartographer
R.I.P.
👻
Ghosted
Route Planning in Transportation Networks
R.I.P.
👻
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
👻
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
👻
Ghosted
Graph Isomorphism in Quasipolynomial Time
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way: 👻 Ghosted
R.I.P.
👻
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
👻
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
👻
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
👻
Ghosted