Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

October 28, 2017 · Declared Dead · 🏛 Presented at Adaptive Learning Agents workshop (ALA2018), July 14th, 2018, Stockholm, Sweden

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Sergio Valcarcel Macua, Aleksi Tukiainen, Daniel García-Ocaña Hernández, David Baldazo, Enrique Munoz de Cote, Santiago Zazo
arXiv ID: 1710.10363
Category: cs.LG (Machine Learning)
Cross-listed: cs.MA, math.OC, stat.ML
Citations: 31
Venue: Presented at Adaptive Learning Agents workshop (ALA2018), July 14th, 2018, Stockholm, Sweden
Last Checked: 3 months ago

Abstract
We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named Diff-DAC, with application to single-task and to average multitask reinforcement learning (MRL). Each agent has access to data from its local task only, but it aims to learn a policy that performs well on average for the whole set of tasks. During the learning process, agents communicate their value-policy parameters to their neighbors, diffusing the information across the network, so that they converge to a common policy, with no need for a central node. The method is scalable, since the computational and communication costs per agent grow with its number of neighbors. We derive Diff-DAC from duality theory and provide novel insights into the standard actor-critic framework, showing that it is actually an instance of the dual ascent method that approximates the solution of a linear program. Experiments suggest that Diff-DAC can outperform the single previous distributed MRL approach (i.e., Dist-MTLPS) and even the centralized architecture.
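
The parameter-diffusion idea in the abstract (each agent updates on its own task, then averages parameters with its neighbors, with no central node) can be illustrated with a toy sketch. This is not the authors' code, since none is linked, and it is not Diff-DAC itself: the ring topology, the uniform 1/3 combination weights, the quadratic per-agent loss standing in for an actor-critic objective, and every function name below are assumptions made purely for illustration.

```python
import random


def ring_weights(n):
    """Combination matrix for a ring of n >= 3 agents (assumed topology).

    Each agent averages itself and its two ring neighbors with weight 1/3,
    which makes the matrix doubly stochastic.
    """
    W = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for j in (k - 1, k, k + 1):
            W[k][j % n] += 1.0 / 3.0
    return W


def local_gradient(theta, target):
    """Gradient of the toy local loss 0.5 * ||theta - target||^2.

    Hypothetical stand-in for an agent's actor-critic gradient on its own task.
    """
    return [t - c for t, c in zip(theta, target)]


def diffusion_step(thetas, targets, W, step_size):
    """One adapt-then-combine update for all agents."""
    # Adapt: each agent takes a gradient step using only its local task.
    adapted = [
        [t - step_size * g for t, g in zip(theta, local_gradient(theta, target))]
        for theta, target in zip(thetas, targets)
    ]
    # Combine: each agent averages the adapted parameters of its neighbors.
    n, dim = len(thetas), len(thetas[0])
    return [
        [sum(W[k][j] * adapted[j][d] for j in range(n)) for d in range(dim)]
        for k in range(n)
    ]


def run(n_agents=6, dim=2, steps=2000, step_size=0.01, seed=0):
    """Run the toy diffusion and return (final parameters, average target)."""
    rng = random.Random(seed)
    # One random "task optimum" per agent; the average task optimum is their mean.
    targets = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_agents)]
    thetas = [[0.0] * dim for _ in range(n_agents)]
    W = ring_weights(n_agents)
    for _ in range(steps):
        thetas = diffusion_step(thetas, targets, W, step_size)
    average_target = [sum(t[d] for t in targets) / n_agents for d in range(dim)]
    return thetas, average_target


if __name__ == "__main__":
    thetas, average_target = run()
    print("average target:", average_target)
    for k, theta in enumerate(thetas):
        print("agent", k, theta)
```

In this toy setting every agent ends up near the minimizer of the average loss even though it only ever sees its own target and its two neighbors' parameters. With a constant step size the agents agree only approximately; the residual disagreement shrinks as the step size does.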
