Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models

December 14, 2020 ยท Declared Dead ยท ๐Ÿ› IEEE Annual Symposium on Foundations of Computer Science

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Ilias Diakonikolas, Daniel M. Kane arXiv ID 2012.07774 Category cs.LG: Machine Learning Cross-listed cs.CC, cs.DS, math.AG, math.ST Citations 33 Venue IEEE Annual Symposium on Foundations of Computer Science Last Checked 3 months ago
Abstract
Let $V$ be any vector space of multivariate degree-$d$ homogeneous polynomials with co-dimension at most $k$, and $S$ be the set of points where all polynomials in $V$ {\em nearly} vanish. We establish a qualitatively optimal upper bound on the size of $ฮต$-covers for $S$, in the $\ell_2$-norm. Roughly speaking, we show that there exists an $ฮต$-cover for $S$ of cardinality $M = (k/ฮต)^{O_d(k^{1/d})}$. Our result is constructive yielding an algorithm to compute such an $ฮต$-cover that runs in time $\mathrm{poly}(M)$. Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables. These include density and parameter estimation for $k$-mixtures of spherical Gaussians (with known common covariance), PAC learning one-hidden-layer ReLU networks with $k$ hidden units (under the Gaussian distribution), density and parameter estimation for $k$-mixtures of linear regressions (with Gaussian covariates), and parameter estimation for $k$-mixtures of hyperplanes. Our algorithms run in time {\em quasi-polynomial} in the parameter $k$. Previous algorithms for these problems had running times exponential in $k^{ฮฉ(1)}$. At a high-level our algorithms for all these learning problems work as follows: By computing the low-degree moments of the hidden parameters, we are able to find a vector space of polynomials that nearly vanish on the unknown parameters. Our structural result allows us to compute a quasi-polynomial sized cover for the set of hidden parameters, which we exploit in our learning algorithms.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Machine Learning

Died the same way โ€” ๐Ÿ‘ป Ghosted