The Power of Uniform Sampling for Coresets
September 05, 2022 Β· Declared Dead Β· π IEEE Annual Symposium on Foundations of Computer Science
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Vladimir Braverman, Vincent Cohen-Addad, Shaofeng H. -C. Jiang, Robert Krauthgamer, Chris Schwiegelshohn, Mads Bech Toftrup, Xuan Wu
arXiv ID
2209.01901
Category
cs.DS: Data Structures & Algorithms
Citations
44
Venue
IEEE Annual Symposium on Foundations of Computer Science
Last Checked
3 months ago
Abstract
Motivated by practical generalizations of the classic $k$-median and $k$-means objectives, such as clustering with size constraints, fair clustering, and Wasserstein barycenter, we introduce a meta-theorem for designing coresets for constrained-clustering problems. The meta-theorem reduces the task of coreset construction to one on a bounded number of ring instances with a much-relaxed additive error. This reduction enables us to construct coresets using uniform sampling, in contrast to the widely-used importance sampling, and consequently we can easily handle constrained objectives. Notably and perhaps surprisingly, this simpler sampling scheme can yield coresets whose size is independent of $n$, the number of input points. Our technique yields smaller coresets, and sometimes the first coresets, for a large number of constrained clustering problems, including capacitated clustering, fair clustering, Euclidean Wasserstein barycenter, clustering in minor-excluded graph, and polygon clustering under FrΓ©chet and Hausdorff distance. Finally, our technique yields also smaller coresets for $1$-median in low-dimensional Euclidean spaces, specifically of size $\tilde{O}(\varepsilon^{-1.5})$ in $\mathbb{R}^2$ and $\tilde{O}(\varepsilon^{-1.6})$ in $\mathbb{R}^3$.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Relief-Based Feature Selection: Introduction and Review
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
π»
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
π»
Ghosted