On Coresets for Logistic Regression
May 22, 2018 ยท Declared Dead ยท ๐ Neural Information Processing Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Alexander Munteanu, Chris Schwiegelshohn, Christian Sohler, David P. Woodruff
arXiv ID
1805.08571
Category
cs.DS: Data Structures & Algorithms
Cross-listed
cs.LG,
stat.ML
Citations
118
Venue
Neural Information Processing Systems
Last Checked
1 month ago
Abstract
Coresets are one of the central methods to facilitate the analysis of large data sets. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show a negative result, namely, that no strongly sublinear sized coresets exist for logistic regression. To deal with intractable worst-case instances we introduce a complexity measure $ฮผ(X)$, which quantifies the hardness of compressing a data set for logistic regression. $ฮผ(X)$ has an intuitive statistical interpretation that may be of independent interest. For data sets with bounded $ฮผ(X)$-complexity, we show that a novel sensitivity sampling scheme produces the first provably sublinear $(1\pm\varepsilon)$-coreset. We illustrate the performance of our method by comparing to uniform sampling as well as to state of the art methods in the area. The experiments are conducted on real world benchmark data for logistic regression.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Data Structures & Algorithms
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
Relief-Based Feature Selection: Introduction and Review
R.I.P.
๐ป
Ghosted
Route Planning in Transportation Networks
R.I.P.
๐ป
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
๐ป
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
๐ป
Ghosted
Graph Isomorphism in Quasipolynomial Time
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted