Evolutionary Planning in Latent Space
November 23, 2020 ยท Entered Twilight ยท ๐ EvoApplications
"Last commit was 5.0 years ago (โฅ5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .gitignore, README.md, config.json, environment, experiment_checklist.txt, iteration_retester.py, main.py, mdrnn, planning, run_headless.sh, scripts, tests_custom, tuning, utility, vae
Authors
Thor V. A. N. Olesen, Dennis T. T. Nguyen, Rasmus Berg Palm, Sebastian Risi
arXiv ID
2011.11293
Category
cs.LG: Machine Learning
Cross-listed
cs.NE
Citations
5
Venue
EvoApplications
Repository
https://github.com/two2tee/WorldModelPlanning
โญ 17
Last Checked
2 months ago
Abstract
Planning is a powerful approach to reinforcement learning with several desirable properties. However, it requires a model of the world, which is not readily available in many real-life problems. In this paper, we propose to learn a world model that enables Evolutionary Planning in Latent Space (EPLS). We use a Variational Auto Encoder (VAE) to learn a compressed latent representation of individual observations and extend a Mixture Density Recurrent Neural Network (MDRNN) to learn a stochastic, multi-modal forward model of the world that can be used for planning. We use the Random Mutation Hill Climbing (RMHC) to find a sequence of actions that maximize expected reward in this learned model of the world. We demonstrate how to build a model of the world by bootstrapping it with rollouts from a random policy and iteratively refining it with rollouts from an increasingly accurate planning policy using the learned world model. After a few iterations of this refinement, our planning agents are better than standard model-free reinforcement learning approaches demonstrating the viability of our approach.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
R.I.P.
๐ป
Ghosted
Semi-Supervised Classification with Graph Convolutional Networks
R.I.P.
๐ป
Ghosted
Proximal Policy Optimization Algorithms
R.I.P.
๐ป
Ghosted