Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time
June 23, 2020 Β· Declared Dead Β· π Neural Information Processing Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Jerry Li, Guanghao Ye
arXiv ID
2006.13312
Category
cs.DS: Data Structures & Algorithms
Cross-listed
cs.LG,
math.ST,
stat.ML
Citations
13
Venue
Neural Information Processing Systems
Last Checked
3 months ago
Abstract
Robust covariance estimation is the following, well-studied problem in high dimensional statistics: given $N$ samples from a $d$-dimensional Gaussian $\mathcal{N}(\boldsymbol{0}, Ξ£)$, but where an $\varepsilon$-fraction of the samples have been arbitrarily corrupted, output $\widehatΞ£$ minimizing the total variation distance between $\mathcal{N}(\boldsymbol{0}, Ξ£)$ and $\mathcal{N}(\boldsymbol{0}, \widehatΞ£)$. This corresponds to learning $Ξ£$ in a natural affine-invariant variant of the Frobenius norm known as the \emph{Mahalanobis norm}. Previous work of Cheng et al demonstrated an algorithm that, given $N = Ξ©(d^2 / \varepsilon^2)$ samples, achieved a near-optimal error of $O(\varepsilon \log 1 / \varepsilon)$, and moreover, their algorithm ran in time $\widetilde{O}(T(N, d) \log ΞΊ/ \mathrm{poly} (\varepsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose, and $ΞΊ$ is the condition number of $Ξ£$. When $\varepsilon$ is relatively small, their polynomial dependence on $1/\varepsilon$ in the runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees, but which runs in time $\widetilde{O} (T(N, d) \log ΞΊ)$. In particular, our runtime has no dependence on $\varepsilon$. When $Ξ£$ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted