Gating Mechanisms for Combining Character and Word-level Word Representations: An Empirical Study
April 11, 2019 · Entered Twilight · North American Chapter of the Association for Computational Linguistics
"Last commit was 6.0 years ago (โฅ5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .gitignore, LICENSE, README.md, SentEval, base_args.py, data, requirements.txt, scripts, src, train.py
Authors
Jorge A. Balazs, Yutaka Matsuo
arXiv ID
1904.05584
Category
cs.CL: Computation & Language
Cross-listed
stat.ML
Citations
3
Venue
North American Chapter of the Association for Computational Linguistics
Repository
https://github.com/jabalazs/gating
⭐ 7
Last Checked
1 month ago
Abstract
In this paper we study how different ways of combining character and word-level representations affect the quality of both final word and sentence representations. We provide strong empirical evidence that modeling characters improves the learned representations at the word and sentence levels, and that doing so is particularly useful when representing less frequent words. We further show that a feature-wise sigmoid gating mechanism is a robust method for creating representations that encode semantic similarity, as it performed reasonably well in several word similarity datasets. Finally, our findings suggest that properly capturing semantic similarity at the word level does not consistently yield improved performance in downstream sentence-level tasks. Our code is available at https://github.com/jabalazs/gating
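The abstract's key finding concerns a feature-wise (per-dimension) sigmoid gate for mixing a word-level embedding with a character-level representation. As a rough illustration only, here is a minimal PyTorch sketch of that idea. This is not the repo's actual implementation: the module name `FeatureWiseGate`, the dimensions, and the choice to condition the gate on the word embedding alone (in the spirit of fine-grained gating, Yang et al., 2017) are all assumptions.

```python
import torch
import torch.nn as nn


class FeatureWiseGate(nn.Module):
    """Hypothetical sketch of a feature-wise sigmoid gate.

    Combines a word-level embedding w and a character-level
    representation c of the same dimensionality:

        g = sigmoid(W_g w + b_g)    # one gate value per feature
        x = (1 - g) * w + g * c     # elementwise interpolation
    """

    def __init__(self, dim: int):
        super().__init__()
        # Assumed parameterization: gate logits computed from w only.
        self.gate = nn.Linear(dim, dim)

    def forward(self, w: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(w))   # (batch, dim), values in (0, 1)
        return (1.0 - g) * w + g * c      # gated mixture of the two views


if __name__ == "__main__":
    # Toy usage: both representations must share a dimensionality.
    gate = FeatureWiseGate(dim=300)
    w = torch.randn(32, 300)  # word-level embeddings
    c = torch.randn(32, 300)  # character-level (e.g., char-BiLSTM) outputs
    x = gate(w, c)
    print(x.shape)  # torch.Size([32, 300])
```

Because the gate is per-feature rather than a single scalar, the model can lean on character information for some dimensions (plausibly helpful for the rare words the abstract highlights) while preserving the word-level signal in others.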
Similar Papers
In the same crypt · Computation & Language
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding · R.I.P. · 👻 Ghosted
Language Models are Few-Shot Learners · R.I.P. · 👻 Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach · R.I.P. · 👻 Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension · R.I.P. · 👻 Ghosted