Source Traces for Temporal Difference Learning

February 08, 2019 · Entered Twilight · 🏛 AAAI Conference on Artificial Intelligence

🌅 TWILIGHT: Old Age
Predates the code-sharing era, a pioneer of its time

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, .ipynb_checkpoints, 2D Norm Result Plotting.ipynb, Make3dAndTaxi.ipynb, Metric nearness example in Pytorch.ipynb, PrepXYDPickles.ipynb, custom_metric_loss_ops.py, data.py, experiment.py, experiment_2d_norm.py, metrics_pytorch.py, metrics_tf1.py, norm_utils.py, readme.md, requirements.txt

Authors: Silviu Pitis
arXiv ID: 1902.02907
Category: cs.LG (Machine Learning)
Cross-listed: cs.AI, stat.ML
Citations: 19
Venue: AAAI Conference on Artificial Intelligence
Repository: https://github.com/spitis/deepnorms (⭐ 11)
Last Checked: 15 days ago
Abstract
This paper motivates and develops source traces for temporal difference (TD) learning in the tabular setting. Source traces are like eligibility traces, but model potential histories rather than immediate ones. This allows TD errors to be propagated to potential causal states and leads to faster generalization. Source traces can be thought of as the model-based, backward view of successor representations (SR), and share many of the same benefits. This view, however, suggests several new ideas. First, a TD($\lambda$)-like source learning algorithm is proposed and its convergence is proven. Then, a novel algorithm for learning the source map (or SR matrix) is developed and shown to outperform the previous algorithm. Finally, various approaches to using the source/SR model are explored, and it is shown that source traces can be effectively combined with other model-based methods like Dyna and experience replay.
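
To make the idea of propagating a TD error to potential causal states concrete, here is a minimal sketch rather than the paper's implementation. It assumes the source map is the standard SR matrix S = (I - γP)^(-1) computed from a known transition matrix P, and that the source trace of a state is the corresponding column of S. The five-state deterministic chain and the names `source_map`, `td_source_episode`, and `td0_episode` are hypothetical and chosen only for illustration.

```python
import numpy as np

def source_map(P, gamma):
    """SR / source matrix S = (I - gamma * P)^-1 for a known transition matrix P."""
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

def td_source_episode(v, S, episode, alpha, gamma):
    """One pass of source-trace TD (assumed form): the TD error observed at
    state s is credited to every state i in proportion to S[i, s]."""
    v = v.copy()
    for s, r, s_next in episode:
        v_next = 0.0 if s_next is None else v[s_next]  # None marks termination
        delta = r + gamma * v_next - v[s]
        v += alpha * delta * S[:, s]
    return v

def td0_episode(v, episode, alpha, gamma):
    """One pass of ordinary TD(0) for comparison: only the visited state moves."""
    v = v.copy()
    for s, r, s_next in episode:
        v_next = 0.0 if s_next is None else v[s_next]
        v[s] += alpha * (r + gamma * v_next - v[s])
    return v

# Hypothetical chain 0 -> 1 -> 2 -> 3 -> 4 -> terminal, reward 1 on leaving state 4.
n, gamma, alpha = 5, 0.9, 1.0
P = np.zeros((n, n))
for i in range(n - 1):
    P[i, i + 1] = 1.0
rewards = np.zeros(n)
rewards[n - 1] = 1.0
episode = [(i, rewards[i], i + 1 if i + 1 < n else None) for i in range(n)]

S = source_map(P, gamma)
v_true = S @ rewards  # exact values for this chain
v_source = td_source_episode(np.zeros(n), S, episode, alpha, gamma)
v_td0 = td0_episode(np.zeros(n), episode, alpha, gamma)

print("true     ", np.round(v_true, 4))
print("source TD", np.round(v_source, 4))
print("TD(0)    ", np.round(v_td0, 4))
```

In this toy setting, with an exact S and a step size of 1, a single episode is enough for the source-trace update to match the true values, while TD(0) has only updated the last state. The paper learns the source map from experience rather than inverting a known model, so this sketch shows only the credit-assignment step.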