Working Memory Networks: Augmenting Memory Networks with a Relational Reasoning Module
May 23, 2018 Β· Entered Twilight Β· π Annual Meeting of the Association for Computational Linguistics
"Last commit was 7.0 years ago (β₯5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: README.md, RN_bAbI.py, WMemNN_NLVR.py, WMemNN_bAbI.py, plots, prepare.py, requeriments.txt, utils.py
Authors
Juan Pavez, HΓ©ctor Allende, HΓ©ctor Allende-Cid
arXiv ID
1805.09354
Category
cs.CL: Computation & Language
Citations
22
Venue
Annual Meeting of the Association for Computational Linguistics
Repository
https://github.com/jgpavez/Working-Memory-Networks
β 26
Last Checked
1 month ago
Abstract
During the last years, there has been a lot of interest in achieving some kind of complex reasoning using deep neural networks. To do that, models like Memory Networks (MemNNs) have combined external memory storages and attention mechanisms. These architectures, however, lack of more complex reasoning mechanisms that could allow, for instance, relational reasoning. Relation Networks (RNs), on the other hand, have shown outstanding results in relational reasoning tasks. Unfortunately, their computational cost grows quadratically with the number of memories, something prohibitive for larger problems. To solve these issues, we introduce the Working Memory Network, a MemNN architecture with a novel working memory storage and reasoning module. Our model retains the relational reasoning abilities of the RN while reducing its computational complexity from quadratic to linear. We tested our model on the text QA dataset bAbI and the visual QA dataset NLVR. In the jointly trained bAbI-10k, we set a new state-of-the-art, achieving a mean error of less than 0.5%. Moreover, a simple ensemble of two of our models solves all 20 tasks in the joint version of the benchmark.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Computation & Language
π
π
Old Age
π
π
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach
R.I.P.
π»
Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
R.I.P.
π»
Ghosted