MAP-Elites with Descriptor-Conditioned Gradients and Archive Distillation into a Single Policy
March 07, 2023 Β· Declared Dead Β· π Annual Conference on Genetic and Evolutionary Computation
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Maxence Faldor, FΓ©lix Chalumeau, Manon Flageat, Antoine Cully
arXiv ID
2303.03832
Category
cs.NE: Neural & Evolutionary
Citations
25
Venue
Annual Conference on Genetic and Evolutionary Computation
Last Checked
3 months ago
Abstract
Quality-Diversity algorithms, such as MAP-Elites, are a branch of Evolutionary Computation generating collections of diverse and high-performing solutions, that have been successfully applied to a variety of domains and particularly in evolutionary robotics. However, MAP-Elites performs a divergent search based on random mutations originating from Genetic Algorithms, and thus, is limited to evolving populations of low-dimensional solutions. PGA-MAP-Elites overcomes this limitation by integrating a gradient-based variation operator inspired by Deep Reinforcement Learning which enables the evolution of large neural networks. Although high-performing in many environments, PGA-MAP-Elites fails on several tasks where the convergent search of the gradient-based operator does not direct mutations towards archive-improving solutions. In this work, we present two contributions: (1) we enhance the Policy Gradient variation operator with a descriptor-conditioned critic that improves the archive across the entire descriptor space, (2) we exploit the actor-critic training to learn a descriptor-conditioned policy at no additional cost, distilling the knowledge of the archive into one single versatile policy that can execute the entire range of behaviors contained in the archive. Our algorithm, DCG-MAP-Elites improves the QD score over PGA-MAP-Elites by 82% on average, on a set of challenging locomotion tasks.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Neural & Evolutionary
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Progressive Growing of GANs for Improved Quality, Stability, and Variation
R.I.P.
π»
Ghosted
Learning both Weights and Connections for Efficient Neural Networks
R.I.P.
π»
Ghosted
LSTM: A Search Space Odyssey
R.I.P.
π»
Ghosted
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
R.I.P.
π»
Ghosted
An Introduction to Convolutional Neural Networks
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
π»
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
π»
Ghosted