High probability generalization bounds for uniformly stable algorithms with nearly optimal rate

February 27, 2019 · Declared Dead · 🏛 Annual Conference on Computational Learning Theory

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Vitaly Feldman, Jan Vondrak
arXiv ID: 1902.10710
Category: cs.LG (Machine Learning)
Cross-listed: cs.DS, stat.ML
Citations: 170
Venue: Annual Conference on Computational Learning Theory
Last checked: 1 month ago
Abstract
Algorithmic stability is a classical approach to understanding and analysis of the generalization error of learning algorithms. A notable weakness of most stability-based generalization bounds is that they hold only in expectation. Generalization with high probability has been established in a landmark paper of Bousquet and Elisseeff (2002) albeit at the expense of an additional $\sqrt{n}$ factor in the bound. Specifically, their bound on the estimation error of any $\gamma$-uniformly stable learning algorithm on $n$ samples and range in $[0,1]$ is $O(\gamma\sqrt{n \log(1/\delta)} + \sqrt{\log(1/\delta)/n})$ with probability $\geq 1-\delta$. The $\sqrt{n}$ overhead makes the bound vacuous in the common settings where $\gamma \geq 1/\sqrt{n}$. A stronger bound was recently proved by the authors (Feldman and Vondrak, 2018) that reduces the overhead to at most $O(n^{1/4})$. Still, both of these results give optimal generalization bounds only when $\gamma = O(1/n)$. We prove a nearly tight bound of $O(\gamma\log(n)\log(n/\delta) + \sqrt{\log(1/\delta)/n})$ on the estimation error of any $\gamma$-uniformly stable algorithm. It implies that for algorithms that are uniformly stable with $\gamma = O(1/\sqrt{n})$, estimation error is essentially the same as the sampling error. Our result leads to the first high-probability generalization bounds for multi-pass stochastic gradient descent and regularized ERM for stochastic convex problems with nearly optimal rate --- resolving open problems in prior work. Our proof technique is new and we introduce several analysis tools that might find additional applications.
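For intuition, the two high-probability bounds quoted in the abstract can be compared numerically. A minimal sketch, with all hidden constants set to 1 and function names our own (the paper itself gives only asymptotic statements):

```python
import math

def bousquet_elisseeff(gamma, n, delta):
    # Bousquet-Elisseeff (2002) bound: O(gamma * sqrt(n * log(1/delta)) + sqrt(log(1/delta)/n)).
    # With gamma = 1/sqrt(n), the first term is sqrt(log(1/delta)): it never vanishes as n grows.
    return gamma * math.sqrt(n * math.log(1 / delta)) + math.sqrt(math.log(1 / delta) / n)

def feldman_vondrak_2019(gamma, n, delta):
    # This paper's bound: O(gamma * log(n) * log(n/delta) + sqrt(log(1/delta)/n)).
    # With gamma = 1/sqrt(n), the first term decays like log^2(n)/sqrt(n) -> 0.
    return gamma * math.log(n) * math.log(n / delta) + math.sqrt(math.log(1 / delta) / n)

n, delta = 10**6, 0.01
gamma = 1 / math.sqrt(n)  # the regime gamma = O(1/sqrt(n)) highlighted in the abstract
print(bousquet_elisseeff(gamma, n, delta))   # stays above 1: vacuous in this regime
print(feldman_vondrak_2019(gamma, n, delta)) # much smaller, shrinking toward the sampling error
```

The contrast illustrates the abstract's point: under uniform stability $\gamma = O(1/\sqrt{n})$, the older bound is vacuous while the new one decays with $n$, up to log factors, at the sampling-error rate $\sqrt{\log(1/\delta)/n}$.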
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt — Machine Learning

Died the same way — 👻 Ghosted