Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach

August 27, 2018 · Entered Twilight · 🏛 Transactions of the Association for Computational Linguistics

🌅 TWILIGHT: Old Age
Predates the code-sharing era, a pioneer of its time

"Last commit was 6.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: LICENSE, README.md, embeddings.py, geomm.py, geomm_cmp_pip.py, geomm_multi.py, geomm_multi_results.sh, geomm_optimized.py, geomm_results.sh, geomm_semi.py, geomm_semi_results.sh, muse_data, requirements.txt, utils.py, utils_2.py, vecmap_data

Authors: Pratik Jawanpuria, Arjun Balgovind, Anoop Kunchukuttan, Bamdev Mishra
arXiv ID: 1808.08773
Category: cs.LG (Machine Learning)
Cross-listed: cs.AI, cs.CL, stat.ML
Citations: 80
Venue: Transactions of the Association for Computational Linguistics
Repository: https://github.com/anoopkunchukuttan/geomm
Stars: 26
Last Checked: 1 month ago
Abstract
We propose a novel geometric approach for learning bilingual mappings given monolingual embeddings and a bilingual dictionary. Our approach decouples learning the transformation from the source language to the target language into (a) learning rotations for language-specific embeddings to align them to a common space, and (b) learning a similarity metric in the common space to model similarities between the embeddings. We model the bilingual mapping problem as an optimization problem on smooth Riemannian manifolds. We show that our approach outperforms previous approaches on the bilingual lexicon induction and cross-lingual word similarity tasks. We also generalize our framework to represent multiple languages in a common latent space. In particular, the latent space representations for several languages are learned jointly, given bilingual dictionaries for multiple language pairs. We illustrate the effectiveness of joint learning for multiple languages in a zero-shot word translation setting. Our implementation is available at https://github.com/anoopkunchukuttan/geomm .
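The two-step decomposition in the abstract can be illustrated with a small sketch. It is one plausible reading of the abstract, not the code from the repository. It assumes that each language has its own orthogonal matrix (`U` for the source, `V` for the target) that rotates embeddings into the common space, and that the similarity metric is a symmetric positive definite matrix `B`, so a source vector `x` and a target vector `z` are scored as `(U^T x)^T B (V^T z)`. All names (`rotation`, `similarity`, `translate`) and the 2-dimensional toy values are hypothetical, and nothing is learned here: the actual method optimizes these matrices on Riemannian manifolds from a bilingual dictionary.

```python
import math


def rotation(theta):
    """Return a 2x2 rotation matrix, i.e. an orthogonal matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def similarity(x, z, U, V, B):
    """Score source vector x against target vector z as (U^T x)^T B (V^T z)."""
    x_common = matvec(transpose(U), x)  # step (a): rotate source into the common space
    z_common = matvec(transpose(V), z)  # step (a): rotate target into the common space
    return dot(x_common, matvec(B, z_common))  # step (b): metric in the common space


def translate(x, targets, U, V, B):
    """Return the index of the target vector most similar to x."""
    scores = [similarity(x, z, U, V, B) for z in targets]
    return max(range(len(targets)), key=scores.__getitem__)


# Toy, hand-picked values (assumptions for illustration, not learned parameters).
U = rotation(0.3)
V = rotation(-1.1)
B = [[2.0, 0.5], [0.5, 1.0]]  # symmetric, with positive trace and determinant
x = [1.0, 0.0]
targets = [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
best = translate(x, targets, U, V, B)
```

Under these assumptions, bilingual lexicon induction reduces to a nearest-neighbour search with `similarity` as the score. Keeping the rotations separate from `B` is what would allow several languages to share one common space, each with its own rotation.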