Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise
December 17, 2020 ยท Declared Dead ยท ๐ Annual Conference Computational Learning Theory
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Ilias Diakonikolas, Daniel M. Kane
arXiv ID
2012.09720
Category
cs.LG: Machine Learning
Cross-listed
cs.CC,
math.ST,
stat.ML
Citations
28
Venue
Annual Conference Computational Learning Theory
Last Checked
3 months ago
Abstract
We study the problem of PAC learning halfspaces with Massart noise. Given labeled samples $(x, y)$ from a distribution $D$ on $\mathbb{R}^{d} \times \{ \pm 1\}$ such that the marginal $D_x$ on the examples is arbitrary and the label $y$ of example $x$ is generated from the target halfspace corrupted by a Massart adversary with flipping probability $ฮท(x) \leq ฮท\leq 1/2$, the goal is to compute a hypothesis with small misclassification error. The best known $\mathrm{poly}(d, 1/ฮต)$-time algorithms for this problem achieve error of $ฮท+ฮต$, which can be far from the optimal bound of $\mathrm{OPT}+ฮต$, where $\mathrm{OPT} = \mathbf{E}_{x \sim D_x} [ฮท(x)]$. While it is known that achieving $\mathrm{OPT}+o(1)$ error requires super-polynomial time in the Statistical Query model, a large gap remains between known upper and lower bounds. In this work, we essentially characterize the efficient learnability of Massart halfspaces in the Statistical Query (SQ) model. Specifically, we show that no efficient SQ algorithm for learning Massart halfspaces on $\mathbb{R}^d$ can achieve error better than $ฮฉ(ฮท)$, even if $\mathrm{OPT} = 2^{-\log^{c} (d)}$, for any universal constant $c \in (0, 1)$. Furthermore, when the noise upper bound $ฮท$ is close to $1/2$, our error lower bound becomes $ฮท- o_ฮท(1)$, where the $o_ฮท(1)$ term goes to $0$ when $ฮท$ approaches $1/2$. Our results provide strong evidence that known learning algorithms for Massart halfspaces are nearly best possible, thereby resolving a longstanding open problem in learning theory.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
R.I.P.
๐ป
Ghosted
Semi-Supervised Classification with Graph Convolutional Networks
R.I.P.
๐ป
Ghosted
Proximal Policy Optimization Algorithms
R.I.P.
๐ป
Ghosted
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
R.I.P.
๐ป
Ghosted
A Unified Approach to Interpreting Model Predictions
R.I.P.
๐ป
Ghosted