Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

November 18, 2016 · Entered Twilight · 🏛 International Conference on Learning Representations

"Last commit was 7.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: LICENSE, README.md, ga3c

Authors: Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz
arXiv ID: 1611.06256
Category: cs.LG (Machine Learning)
Citations: 291
Venue: International Conference on Learning Representations
Repository: https://github.com/NVlabs/GA3C ⭐ 661
Last Checked: 1 month ago

Abstract

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers at https://github.com/NVlabs/GA3C.
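
The "system of queues" mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `stub_policy`, `predictor`, `agent`, and `run` are hypothetical, the model is a plain Python function standing in for a batched GPU forward pass, and the dynamic scheduling strategy and the training step are left out. It only shows the general pattern of many agent threads sharing one prediction queue whose requests are served in batches.

```python
import queue
import threading


def stub_policy(states):
    # Hypothetical stand-in for one batched GPU forward pass.
    return [2 * s for s in states]


def predictor(prediction_q, reply_qs, max_batch, stop):
    # Collect pending requests into a batch, run the model once,
    # then route each result back to the agent that asked for it.
    while not stop.is_set():
        try:
            first = prediction_q.get(timeout=0.05)
        except queue.Empty:
            continue
        batch = [first]
        while len(batch) < max_batch:
            try:
                batch.append(prediction_q.get_nowait())
            except queue.Empty:
                break
        ids, states = zip(*batch)
        for agent_id, action in zip(ids, stub_policy(list(states))):
            reply_qs[agent_id].put(action)


def agent(agent_id, steps, prediction_q, reply_q, training_q):
    # Toy environment loop: the "state" is just a counter.
    state = agent_id
    for _ in range(steps):
        prediction_q.put((agent_id, state))        # ask for an action
        action = reply_q.get()                     # wait for the batched result
        training_q.put((agent_id, state, action))  # hand experience to a trainer
        state += 1


def run(num_agents=4, steps=3, max_batch=8):
    prediction_q = queue.Queue()
    training_q = queue.Queue()
    reply_qs = [queue.Queue() for _ in range(num_agents)]
    stop = threading.Event()

    pred = threading.Thread(
        target=predictor, args=(prediction_q, reply_qs, max_batch, stop)
    )
    pred.start()
    agents = [
        threading.Thread(
            target=agent, args=(i, steps, prediction_q, reply_qs[i], training_q)
        )
        for i in range(num_agents)
    ]
    for t in agents:
        t.start()
    for t in agents:
        t.join()
    stop.set()
    pred.join()

    # A real system would have trainer threads draining this queue in batches.
    experiences = []
    while not training_q.empty():
        experiences.append(training_q.get())
    return sorted(experiences)


if __name__ == "__main__":
    for item in run(num_agents=2, steps=2):
        print(item)
```

Batching requests from many agents into a single model call is what lets a design like this keep a GPU busy; how many agents, predictors, and trainers to run is the kind of parameter a dynamic scheduling strategy would tune.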