Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time

June 23, 2020 · Declared Dead · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Jerry Li, Guanghao Ye arXiv ID 2006.13312 Category cs.DS: Data Structures & Algorithms Cross-listed cs.LG, math.ST, stat.ML Citations 13 Venue Neural Information Processing Systems Last Checked 3 months ago

Abstract

Robust covariance estimation is the following, well-studied problem in high dimensional statistics: given $N$ samples from a $d$-dimensional Gaussian $\mathcal{N}(\boldsymbol{0}, Σ)$, but where an $\varepsilon$-fraction of the samples have been arbitrarily corrupted, output $\widehatΣ$ minimizing the total variation distance between $\mathcal{N}(\boldsymbol{0}, Σ)$ and $\mathcal{N}(\boldsymbol{0}, \widehatΣ)$. This corresponds to learning $Σ$ in a natural affine-invariant variant of the Frobenius norm known as the \emph{Mahalanobis norm}. Previous work of Cheng et al demonstrated an algorithm that, given $N = Ω(d^2 / \varepsilon^2)$ samples, achieved a near-optimal error of $O(\varepsilon \log 1 / \varepsilon)$, and moreover, their algorithm ran in time $\widetilde{O}(T(N, d) \log κ/ \mathrm{poly} (\varepsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose, and $κ$ is the condition number of $Σ$. When $\varepsilon$ is relatively small, their polynomial dependence on $1/\varepsilon$ in the runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees, but which runs in time $\widetilde{O} (T(N, d) \log κ)$. In particular, our runtime has no dependence on $\varepsilon$. When $Σ$ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."