ReSQueing Parallel and Private Stochastic Convex Optimization
January 01, 2023 Β· Declared Dead Β· π IEEE Annual Symposium on Foundations of Computer Science
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yair Carmon, Arun Jambulapati, Yujia Jin, Yin Tat Lee, Daogao Liu, Aaron Sidford, Kevin Tian
arXiv ID
2301.00457
Category
math.OC: Optimization & Control
Cross-listed
cs.CR,
cs.DS,
cs.LG,
stat.ML
Citations
19
Venue
IEEE Annual Symposium on Foundations of Computer Science
Last Checked
3 months ago
Abstract
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For a SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $Ξ΅_{\text{opt}}$ with $d^{1/3}Ξ΅_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}Ξ΅_{\text{opt}}^{-2/3} + Ξ΅_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $Ξ΅_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. Given $n$ samples of Lipschitz loss functions, prior works [BFTT19, BFGT20, AFKT21, KLL21] established that if $n \gtrsim d Ξ΅_{\text{dp}}^{-2}$, $(Ξ΅_{\text{dp}}, Ξ΄)$-differential privacy is attained at no asymptotic cost to the SCO utility. However, these prior works all required a superlinear number of gradient queries. We close this gap for sufficiently large $n \gtrsim d^2 Ξ΅_{\text{dp}}^{-3}$, by using ReSQue to design an algorithm with near-linear gradient query complexity in this regime.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Optimization & Control
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Local SGD Converges Fast and Communicates Little
R.I.P.
π»
Ghosted
On Lazy Training in Differentiable Programming
π
π
The Cartographer
A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications
R.I.P.
π»
Ghosted
Learned Primal-dual Reconstruction
R.I.P.
π»
Ghosted
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted