An Empirical Evaluation of $k$-Means Coresets

July 03, 2022 · Declared Dead · 🏛 Embedded Systems and Applications

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Chris Schwiegelshohn, Omar Ali Sheikh-Omar

arXiv ID: 2207.00966

Category: cs.DS (Data Structures & Algorithms)

Cross-listed: cs.LG

Citations: 12

Venue: Embedded Systems and Applications

Last Checked: 3 months ago

Abstract

Coresets are among the most popular paradigms for summarizing data. In particular, there exist many high-performance coresets for clustering problems such as $k$-means, in both theory and practice. Curiously, no existing work compares the quality of the available $k$-means coresets. In this paper we perform such an evaluation. No algorithm is currently known for measuring the distortion of a candidate coreset, and we provide some evidence as to why this might be computationally difficult. To complement this, we propose a benchmark on which we argue that computing coresets is challenging and which also allows an easy (heuristic) evaluation of coresets. Using this benchmark and real-world data sets, we conduct an exhaustive evaluation of the most commonly used coreset algorithms from theory and practice.
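To make the notion of distortion in the abstract concrete, the following is a minimal sketch in plain Python. It assumes the common definition in which a weighted coreset is compared with the full data set through the ratio of their $k$-means costs on a candidate set of centers. The helper names (`kmeans_cost`, `uniform_coreset`, `distortion`), the toy data, and the uniform-sampling baseline are illustrative assumptions, not the paper's benchmark or any of the algorithms it evaluates. The check covers a single candidate solution only, which is why it is a heuristic rather than a worst-case measurement.

```python
import random

def kmeans_cost(points, weights, centers):
    """Weighted k-means cost: sum of w * squared distance to the nearest center."""
    total = 0.0
    for p, w in zip(points, weights):
        nearest = min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        total += w * nearest
    return total

def uniform_coreset(points, m, rng):
    """Hypothetical baseline: sample m points uniformly, each weighted n/m."""
    sample = rng.sample(points, m)
    return sample, [len(points) / m] * m

def distortion(points, coreset, coreset_weights, centers):
    """Cost ratio between full data and coreset for ONE candidate solution (>= 1)."""
    full = kmeans_cost(points, [1.0] * len(points), centers)
    approx = kmeans_cost(coreset, coreset_weights, centers)
    return max(full / approx, approx / full)

rng = random.Random(0)
# Toy data (assumption): two Gaussian blobs in the plane, 500 points each.
points = [(rng.gauss(cx, 1.0), rng.gauss(0.0, 1.0))
          for cx in (0.0, 10.0) for _ in range(500)]
coreset, weights = uniform_coreset(points, 50, rng)

# Evaluate on a single hand-picked solution; a true distortion bound would
# need the maximum over all candidate solutions.
centers = [(0.0, 0.0), (10.0, 0.0)]
d = distortion(points, coreset, weights, centers)
print(f"distortion on this candidate solution: {d:.3f}")
```

A value close to 1 on one solution does not certify the coreset, since another set of centers may expose a much larger ratio. That gap is the evaluation difficulty the abstract points to.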

📜 Similar Papers

In the same crypt — Data Structures & Algorithms

📚 The Cartographer

Relief-Based Feature Selection: Introduction and Review

Ryan J. Urbanowicz, Melissa Meeker, ... (+3 more)

cs.DS 🏛 J.BI 📚 1.1K cites 8 years ago

R.I.P. 👻 Ghosted

Route Planning in Transportation Networks

Hannah Bast, Daniel Delling, ... (+6 more)

cs.DS 🏛 Algorithm Engineering 📚 759 cites 11 years ago

R.I.P. 👻 Ghosted

Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration

Jason Altschuler, Jonathan Weed, Philippe Rigollet

cs.DS 🏛 NeurIPS 📚 654 cites 9 years ago

R.I.P. 👻 Ghosted

Hierarchical Clustering: Objective Functions and Algorithms

Vincent Cohen-Addad, Varun Kanade, ... (+2 more)

cs.DS 🏛 SODA 📚 637 cites 9 years ago

R.I.P. 👻 Ghosted

Graph Isomorphism in Quasipolynomial Time

László Babai

cs.DS 🏛 STOC 📚 616 cites 10 years ago

📚 The Cartographer

Simulation optimization: A review of algorithms and applications

Satyajith Amaran, Nikolaos V. Sahinidis, ... (+2 more)

cs.DS 🏛 4OR 📚 588 cites 8 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 9 years ago