Bayes-Adaptive Deep Model-Based Policy Optimisation
October 29, 2020 ยท Entered Twilight ยท ๐ arXiv.org
"Last commit was 5.0 years ago (โฅ5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .gitignore, README.md, bnn_trpo.py, environments, lib, models, notebooks, online_bnn_trpo.py, params, policy, requirements.txt, rllab_algos
Authors
Tai Hoang, Ngo Anh Vien
arXiv ID
2010.15948
Category
cs.RO: Robotics
Cross-listed
cs.LG
Citations
2
Venue
arXiv.org
Repository
https://github.com/thobotics/RoMBRL
โญ 3
Last Checked
2 months ago
Abstract
We introduce a Bayesian (deep) model-based reinforcement learning method (RoMBRL) that can capture model uncertainty to achieve sample-efficient policy optimisation. We propose to formulate the model-based policy optimisation problem as a Bayes-adaptive Markov decision process (BAMDP). RoMBRL maintains model uncertainty via belief distributions through a deep Bayesian neural network whose samples are generated via stochastic gradient Hamiltonian Monte Carlo. Uncertainty is propagated through simulations controlled by sampled models and history-based policies. As beliefs are encoded in visited histories, we propose a history-based policy network that can be end-to-end trained to generalise across history space and will be trained using recurrent Trust-Region Policy Optimisation. We show that RoMBRL outperforms existing approaches on many challenging control benchmark tasks in terms of sample complexity and task performance. The source code of this paper is also publicly available on https://github.com/thobotics/RoMBRL.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Robotics
๐
๐
Old Age
R.I.P.
๐ป
Ghosted
ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras
R.I.P.
๐ป
Ghosted
VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator
R.I.P.
๐ป
Ghosted
ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM
R.I.P.
๐ป
Ghosted
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
R.I.P.
๐ป
Ghosted