Decoupling Representation Learning from Reinforcement Learning
September 14, 2020 · Entered Twilight · International Conference on Machine Learning
"Last commit was 5.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .github, .gitignore, CHANGELOG.md, CONTRIBUTING.md, LICENSE, README.md, data, docs, examples, images, linux_cpu.yml, linux_cuda10.yml, linux_cuda9.yml, macos_cpu.yml, rlpyt, scratch, setup.py, tests
Authors
Adam Stooke, Kimin Lee, Pieter Abbeel, Michael Laskin
arXiv ID
2009.08319
Category
cs.LG: Machine Learning
Cross-listed
cs.AI, cs.CV, stat.ML
Citations
384
Venue
International Conference on Machine Learning
Repository
https://github.com/astooke/rlpyt/tree/master/rlpyt/ul
⭐ 2275
Last Checked
1 month ago
Abstract
In an effort to overcome limitations of reward-driven feature learning in deep reinforcement learning (RL) from images, we propose decoupling representation learning from policy learning. To this end, we introduce a new unsupervised learning (UL) task, called Augmented Temporal Contrast (ATC), which trains a convolutional encoder to associate pairs of observations separated by a short time difference, under image augmentations and using a contrastive loss. In online RL experiments, we show that training the encoder exclusively using ATC matches or outperforms end-to-end RL in most environments. Additionally, we benchmark several leading UL algorithms by pre-training encoders on expert demonstrations and using them, with weights frozen, in RL agents; we find that agents using ATC-trained encoders outperform all others. We also train multi-task encoders on data from multiple environments and show generalization to different downstream RL tasks. Finally, we ablate components of ATC, and introduce a new data augmentation to enable replay of (compressed) latent images from pre-trained encoders when RL requires augmentation. Our experiments span visually diverse RL benchmarks in DeepMind Control, DeepMind Lab, and Atari, and our complete code is available at https://github.com/astooke/rlpyt/tree/master/rlpyt/ul.
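The ATC task described in the abstract can be illustrated with a minimal NumPy sketch: sample pairs of observations a few steps apart, augment both, encode them, and apply a contrastive loss in which each anchor must pick out its own future frame from the batch. Everything concrete here is an assumption made for illustration and is not taken from the rlpyt code: the random-shift augmentation, the plain dot-product similarity, the linear `projection` standing in for the convolutional encoder, and the function names `random_shift`, `sample_pairs`, and `info_nce_loss` are all hypothetical.

```python
import numpy as np


def random_shift(images, pad, rng):
    """Pad each image by replicating edge pixels, then crop back at a random offset."""
    n, h, w = images.shape
    padded = np.pad(images, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.empty_like(images)
    for i in range(n):
        dy, dx = rng.integers(0, 2 * pad + 1, size=2)
        out[i] = padded[i, dy:dy + h, dx:dx + w]
    return out


def sample_pairs(trajectory, k, batch_size, rng):
    """Pick (o_t, o_{t+k}) pairs from one trajectory of shape (T, H, W)."""
    t = rng.integers(0, len(trajectory) - k, size=batch_size)
    return trajectory[t], trajectory[t + k]


def info_nce_loss(anchor_codes, positive_codes):
    """Contrastive loss: row i of the anchors should match row i of the positives."""
    logits = anchor_codes @ positive_codes.T             # (B, B) similarity scores
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))


# Toy usage: 50 random 16x16 frames stand in for an image trajectory.
rng = np.random.default_rng(0)
trajectory = rng.random((50, 16, 16))
anchors, positives = sample_pairs(trajectory, k=3, batch_size=8, rng=rng)
anchors = random_shift(anchors, pad=2, rng=rng)
positives = random_shift(positives, pad=2, rng=rng)

# Assumed stand-in for the convolutional encoder: flatten, then project linearly.
projection = rng.standard_normal((16 * 16, 32)) * 0.1
anchor_codes = anchors.reshape(len(anchors), -1) @ projection
positive_codes = positives.reshape(len(positives), -1) @ projection
loss = info_nce_loss(anchor_codes, positive_codes)
```

In a real training loop the loss would be backpropagated into the encoder only, which is the sense in which representation learning is decoupled from the policy gradient; this sketch computes the loss value and nothing more.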
Similar Papers
In the same crypt: Machine Learning
XGBoost: A Scalable Tree Boosting System
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Semi-Supervised Classification with Graph Convolutional Networks
Proximal Policy Optimization Algorithms