Neural Machine Translation with Recurrent Attention Modeling

July 18, 2016 · Declared Dead · 🏛 Conference of the European Chapter of the Association for Computational Linguistics

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola
arXiv ID: 1607.05108
Category: cs.NE (Neural & Evolutionary)
Cross-listed: cs.CL
Citations: 53
Venue: Conference of the European Chapter of the Association for Computational Linguistics
Last Checked: 3 months ago

Abstract

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future. We improve upon the attention model of Bahdanau et al. (2014) by explicitly modeling the relationship between previous and subsequent attention levels for each word using one recurrent network per input word. This architecture easily captures informative features, such as fertility and regularities in relative distortion. In experiments, we show our parameterization of attention improves translation quality.
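
The abstract describes the mechanism only at a high level, and no official code is linked here. The following is a minimal NumPy sketch of the general idea under stated assumptions, not the authors' implementation: each source word keeps its own recurrent state, that state is updated from the previous step's attention weights in a small window around the word, and the state then feeds into a Bahdanau-style additive score. The function names, the tanh recurrence, the shared weights across words, the window size, and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def init_params(rng, d_enc, d_dec, d_att, d_hid, window):
    """Random parameters for the sketch (all shapes are illustrative assumptions)."""
    return {
        "W_in": rng.normal(scale=0.1, size=(2 * window + 1, d_att)),
        "W_rec": rng.normal(scale=0.1, size=(d_att, d_att)),
        "W_enc": rng.normal(scale=0.1, size=(d_enc, d_hid)),
        "W_dec": rng.normal(scale=0.1, size=(d_dec, d_hid)),
        "W_att": rng.normal(scale=0.1, size=(d_att, d_hid)),
        "v": rng.normal(scale=0.1, size=(d_hid,)),
    }


def recurrent_attention_step(enc, dec_state, prev_alpha, att_state, params, window=1):
    """One decoding step of attention with a per-word recurrent state.

    enc:        (n, d_enc) encoder states, one row per source word
    dec_state:  (d_dec,)   current decoder state
    prev_alpha: (n,)       attention weights from the previous step
    att_state:  (n, d_att) recurrent attention state, one row per source word
    """
    n = enc.shape[0]
    # Zero-pad so every word sees the previous weights of its neighbors.
    padded = np.pad(prev_alpha, window)
    local = np.stack([padded[j:j + 2 * window + 1] for j in range(n)])
    # Per-word recurrent update (plain tanh RNN, weights shared across words).
    new_state = np.tanh(local @ params["W_in"] + att_state @ params["W_rec"])
    # Additive score that also conditions on the attention history state.
    hidden = np.tanh(
        enc @ params["W_enc"]
        + dec_state @ params["W_dec"]
        + new_state @ params["W_att"]
    )
    alpha = softmax(hidden @ params["v"])
    context = alpha @ enc  # weighted sum of encoder states
    return alpha, new_state, context


def demo(n=5, d_enc=6, d_dec=3, d_att=4, d_hid=8, window=1, steps=3, seed=0):
    """Run a few decoding steps on random inputs and return the final values."""
    rng = np.random.default_rng(seed)
    params = init_params(rng, d_enc, d_dec, d_att, d_hid, window)
    enc = rng.normal(size=(n, d_enc))
    alpha = np.full(n, 1.0 / n)        # uniform attention before the first step
    att_state = np.zeros((n, d_att))   # empty attention history
    for _ in range(steps):
        dec_state = rng.normal(size=(d_dec,))  # stand-in for a real decoder
        alpha, att_state, context = recurrent_attention_step(
            enc, dec_state, alpha, att_state, params, window
        )
    return alpha, att_state, context
```

Because the per-word state accumulates how much attention a word and its neighbors have already received, a trained model of this shape could in principle pick up fertility-like and relative-distortion-like patterns, which is the intuition the abstract points to.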

📜 Similar Papers

In the same crypt — Neural & Evolutionary

R.I.P. 👻 Ghosted

A Style-Based Generator Architecture for Generative Adversarial Networks

Tero Karras, Samuli Laine, Timo Aila

cs.NE 🏛 CVPR 📚 12.3K cites 7 years ago

R.I.P. 👻 Ghosted

Progressive Growing of GANs for Improved Quality, Stability, and Variation

Tero Karras, Timo Aila, ... (+2 more)

cs.NE 🏛 ICLR 📚 8.2K cites 8 years ago

R.I.P. 👻 Ghosted

Learning both Weights and Connections for Efficient Neural Networks

Song Han, Jeff Pool, ... (+2 more)

cs.NE 🏛 NeurIPS 📚 7.4K cites 10 years ago

R.I.P. 👻 Ghosted

LSTM: A Search Space Odyssey

Klaus Greff, Rupesh Kumar Srivastava, ... (+3 more)

cs.NE 🏛 IEEE TNNLS 📚 6.0K cites 11 years ago

R.I.P. 👻 Ghosted

A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

Dan Hendrycks, Kevin Gimpel

cs.NE 🏛 ICLR 📚 4.0K cites 9 years ago

R.I.P. 👻 Ghosted

An Introduction to Convolutional Neural Networks

Keiron O'Shea, Ryan Nash

cs.NE 🏛 arXiv 📚 3.8K cites 10 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 6 years ago

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago