The Computational Power of Optimization in Online Learning

April 08, 2015 · Declared Dead · 🏛 Symposium on the Theory of Computing

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Elad Hazan, Tomer Koren
arXiv ID: 1504.02089
Category: cs.LG (Machine Learning)
Cross-listed: cs.GT
Citations: 72
Venue: Symposium on the Theory of Computing
Last Checked: 3 months ago

Abstract

We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give a novel online algorithm that attains vanishing regret with respect to $N$ experts in total $\widetilde{O}(\sqrt{N})$ computation time. We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is $\widetilde{\Theta}(N)$. These results demonstrate an exponential gap between the power of optimization in online learning and its power in statistical learning: in the latter, an optimization oracle (i.e., an efficient empirical risk minimizer) allows one to learn a finite hypothesis class of size $N$ in time $O(\log{N})$. We also study the implications of our results for learning in repeated zero-sum games, in a setting where the players have access to oracles that compute, in constant time, their best response to any mixed strategy of their opponent. We show that the runtime required for approximating the minimax value of the game in this setting is $\widetilde{\Theta}(\sqrt{N})$, yielding again a quadratic improvement upon the oracle-free setting, where $\widetilde{\Theta}(N)$ is known to be tight.
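To make the oracle-free baseline concrete, the following is a minimal sketch of the classical Hedge (exponential weights) method for prediction with expert advice. It is not the algorithm from the paper: it touches all $N$ experts in every round, which is the linear-time regime the abstract compares against. The names `hedge`, `leader_oracle`, and `regret` are illustrative assumptions, and `leader_oracle` is only a stand-in for the optimization oracle, implemented here as a linear scan, whereas the paper assumes the oracle answers in constant time.

```python
import math
import random


def hedge(loss_rounds, eta):
    """Run Hedge on a list of loss vectors (each of length N, entries in [0, 1]).

    Returns the learner's expected cumulative loss and the per-expert
    cumulative losses. Each round costs O(N) time.
    """
    n = len(loss_rounds[0])
    cum = [0.0] * n          # cumulative loss of each expert so far
    learner_loss = 0.0
    for losses in loss_rounds:
        # Weights proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum only rescales the weights (numerical stability).
        m = min(cum)
        weights = [math.exp(-eta * (c - m)) for c in cum]
        total = sum(weights)
        # Expected loss of playing an expert drawn from the normalized weights.
        learner_loss += sum(w * l for w, l in zip(weights, losses)) / total
        for i, l in enumerate(losses):
            cum[i] += l
    return learner_loss, cum


def leader_oracle(cum):
    """Hypothetical stand-in for the optimization oracle: the index of the
    leading expert in retrospect. This version scans all N experts."""
    return min(range(len(cum)), key=cum.__getitem__)


def regret(loss_rounds, eta):
    """Learner's expected cumulative loss minus that of the best fixed expert."""
    learner_loss, cum = hedge(loss_rounds, eta)
    return learner_loss - cum[leader_oracle(cum)]


if __name__ == "__main__":
    random.seed(0)
    n_experts, horizon = 16, 500
    rounds = [[random.random() for _ in range(n_experts)] for _ in range(horizon)]
    eta = math.sqrt(8 * math.log(n_experts) / horizon)
    print("average regret:", regret(rounds, eta) / horizon)
```

With the learning rate $\eta = \sqrt{8 \ln N / T}$, the standard analysis bounds the regret of this scheme by $\sqrt{T \ln N / 2}$, so the average regret vanishes as $T$ grows. The cost is $O(N)$ work per round, which is the running time the oracle model in the abstract improves upon.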