Towards Instance Optimal Bounds for Best Arm Identification
August 22, 2016 ยท Declared Dead ยท ๐ Annual Conference Computational Learning Theory
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Lijie Chen, Jian Li, Mingda Qiao
arXiv ID
1608.06031
Category
cs.LG: Machine Learning
Cross-listed
cs.DS,
stat.ML
Citations
58
Venue
Annual Conference Computational Learning Theory
Last Checked
3 months ago
Abstract
In the classical best arm identification (Best-$1$-Arm) problem, we are given $n$ stochastic bandit arms, each associated with a reward distribution with an unknown mean. We would like to identify the arm with the largest mean with probability at least $1-ฮด$, using as few samples as possible. Understanding the sample complexity of Best-$1$-Arm has attracted significant attention since the last decade. However, the exact sample complexity of the problem is still unknown. Recently, Chen and Li made the gap-entropy conjecture concerning the instance sample complexity of Best-$1$-Arm. Given an instance $I$, let $ฮผ_{[i]}$ be the $i$th largest mean and $ฮ_{[i]}=ฮผ_{[1]}-ฮผ_{[i]}$ be the corresponding gap. $H(I)=\sum_{i=2}^nฮ_{[i]}^{-2}$ is the complexity of the instance. The gap-entropy conjecture states that $ฮฉ\left(H(I)\cdot\left(\lnฮด^{-1}+\mathsf{Ent}(I)\right)\right)$ is an instance lower bound, where $\mathsf{Ent}(I)$ is an entropy-like term determined by the gaps, and there is a $ฮด$-correct algorithm for Best-$1$-Arm with sample complexity $O\left(H(I)\cdot\left(\lnฮด^{-1}+\mathsf{Ent}(I)\right)+ฮ_{[2]}^{-2}\ln\lnฮ_{[2]}^{-1}\right)$. If the conjecture is true, we would have a complete understanding of the instance-wise sample complexity of Best-$1$-Arm. We make significant progress towards the resolution of the gap-entropy conjecture. For the upper bound, we provide a highly nontrivial algorithm which requires \[O\left(H(I)\cdot\left(\lnฮด^{-1} +\mathsf{Ent}(I)\right)+ฮ_{[2]}^{-2}\ln\lnฮ_{[2]}^{-1}\mathrm{polylog}(n,ฮด^{-1})\right)\] samples in expectation. For the lower bound, we show that for any Gaussian Best-$1$-Arm instance with gaps of the form $2^{-k}$, any $ฮด$-correct monotone algorithm requires $ฮฉ\left(H(I)\cdot\left(\lnฮด^{-1} + \mathsf{Ent}(I)\right)\right)$ samples in expectation.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
R.I.P.
๐ป
Ghosted
Semi-Supervised Classification with Graph Convolutional Networks
R.I.P.
๐ป
Ghosted
Proximal Policy Optimization Algorithms
R.I.P.
๐ป
Ghosted
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
R.I.P.
๐ป
Ghosted
A Unified Approach to Interpreting Model Predictions
R.I.P.
๐ป
Ghosted