The Quality-Diversity Transformer: Generating Behavior-Conditioned Trajectories with Decision Transformers
March 27, 2023 · Declared Dead · Annual Conference on Genetic and Evolutionary Computation
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Valentin Macé, Raphaël Boige, Felix Chalumeau, Thomas Pierrot, Guillaume Richard, Nicolas Perrin-Gilbert
arXiv ID
2303.16207
Category
cs.NE: Neural & Evolutionary
Cross-listed
cs.AI
Citations
15
Venue
Annual Conference on Genetic and Evolutionary Computation
Last Checked
3 months ago
Abstract
In the context of neuroevolution, Quality-Diversity algorithms have proven effective in generating repertoires of diverse and efficient policies by relying on the definition of a behavior space. A natural goal induced by the creation of such a repertoire is trying to achieve behaviors on demand, which can be done by running the corresponding policy from the repertoire. However, in uncertain environments, two problems arise. First, policies can lack robustness and repeatability, meaning that multiple episodes under slightly different conditions often result in very different behaviors. Second, due to the discrete nature of the repertoire, solutions vary discontinuously. Here we present a new approach to achieve behavior-conditioned trajectory generation based on two mechanisms: First, MAP-Elites Low-Spread (ME-LS), which constrains the selection of solutions to those that are the most consistent in the behavior space. Second, the Quality-Diversity Transformer (QDT), a Transformer-based model conditioned on continuous behavior descriptors, which trains on a dataset generated by policies from a ME-LS repertoire and learns to autoregressively generate sequences of actions that achieve target behaviors. Results show that ME-LS produces consistent and robust policies, and that its combination with the QDT yields a single policy capable of achieving diverse behaviors on demand with high accuracy.
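The abstract says only that ME-LS restricts selection to the solutions that are most consistent in the behavior space; it does not give the exact rule. The sketch below is one possible reading, not the authors' implementation. It assumes that "spread" is the mean pairwise Euclidean distance between the behavior descriptors of repeated episodes, that a policy is filed under the grid cell of its mean descriptor, and that a candidate replaces the occupant of a cell only when its mean fitness is at least as high and its spread is at least as low. The names `behavior_spread`, `cell_index`, and `try_insert` are hypothetical.

```python
import numpy as np


def behavior_spread(descriptors):
    """Mean pairwise Euclidean distance between per-episode descriptors.

    ASSUMPTION: this is an illustrative definition of "spread"; the
    abstract does not define the measure.
    """
    d = np.asarray(descriptors, dtype=float)
    n = len(d)
    if n < 2:
        return 0.0
    dists = np.linalg.norm(d[:, None, :] - d[None, :, :], axis=-1)
    # Upper triangle (k=1) counts each unordered pair exactly once.
    return float(dists[np.triu_indices(n, k=1)].mean())


def cell_index(descriptor, low, high, bins):
    """Map a descriptor to a cell of a uniform MAP-Elites grid."""
    ratio = (np.asarray(descriptor, dtype=float) - low) / (high - low)
    idx = np.clip((ratio * bins).astype(int), 0, bins - 1)
    return tuple(int(i) for i in idx)


def try_insert(repertoire, policy, fitnesses, descriptors,
               low=0.0, high=1.0, bins=10):
    """Insert `policy` if its cell is empty or it is both fitter and more consistent.

    ASSUMPTION: the acceptance rule (fitness not worse AND spread not
    worse) is a guess at what "low-spread" selection could look like.
    Returns True when the repertoire was updated.
    """
    candidate = {
        "policy": policy,
        "fitness": float(np.mean(fitnesses)),
        "spread": behavior_spread(descriptors),
    }
    cell = cell_index(np.mean(descriptors, axis=0), low, high, bins)
    occupant = repertoire.get(cell)
    if occupant is None or (
        candidate["fitness"] >= occupant["fitness"]
        and candidate["spread"] <= occupant["spread"]
    ):
        repertoire[cell] = candidate
        return True
    return False
```

Under this rule, a candidate with higher fitness but scattered descriptors does not displace a consistent occupant, which matches the stated goal of favoring repeatable behaviors. Trajectories from the resulting repertoire could then serve as the behavior-labelled dataset on which the QDT is trained.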
Similar Papers
In the same crypt: Neural & Evolutionary
Progressive Growing of GANs for Improved Quality, Stability, and Variation (Ghosted)
Learning both Weights and Connections for Efficient Neural Networks (Ghosted)
LSTM: A Search Space Odyssey (Ghosted)
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks (Ghosted)
An Introduction to Convolutional Neural Networks (Ghosted)
Died the same way: Ghosted
Language Models are Few-Shot Learners (Ghosted)
PyTorch: An Imperative Style, High-Performance Deep Learning Library (Ghosted)
XGBoost: A Scalable Tree Boosting System (Ghosted)