On efficient robust regression with subquadratic samples

May 18, 2026 Β· Grace Period Β· πŸ› COLT 2026

⏳ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors Deeksha Adil, JarosΕ‚aw BΕ‚asiok, Hongjie Chen, Deepak Narayanan Sridharan arXiv ID 2605.18042 Category cs.DS: Data Structures & Algorithms Cross-listed stat.ML Citations 0 Venue COLT 2026
Abstract
We revisit the problem of robust linear regression under Gaussian covariates with an unknown covariance matrix of condition number $κ$. For this fundamental problem, significant gaps remain in our understanding of the trade-offs among sample complexity, condition number, runtime, and prediction error for efficient algorithms. Our first result is a near-linear-time algorithm that uses $\widetilde{O}(d/Ρ^4)$ samples, where $d$ is the dimension and $Ρ$ is the corruption rate, and achieves prediction error $O(\sqrt{Ρκ})$ under the condition $Ρκ\lesssim 1$, improving over all prior works. We complement this result with a Statistical Query (SQ) lower bound showing that efficient SQ algorithms achieving error $o(\sqrt{Ρκ})$ when $Ρκ\lesssim 1$ require queries that take $Ω(d^2)$ samples to simulate. Finally, we prove a low-degree polynomial lower bound that gives fine-grained evidence that, without assumptions such as $Ρκ\lesssim 1$, efficient algorithms may require $\tildeΩ\left(\min\{dΡ^{2}κ^{2},\ Ρ^{2}d^{2}\}\right)$ samples to significantly outperform the trivial estimator that always guesses $0$.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Data Structures & Algorithms