Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
July 02, 2018 · The Cartographer · Computational Linguistics
Authors
Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen
arXiv ID
1807.00914
Category
cs.CL: Computation & Language
Citations
152
Venue
Computational Linguistics
Abstract
Linguistic typology aims to capture structural and semantic variation across the world's languages. A large-scale typology could provide excellent guidance for multilingual Natural Language Processing (NLP), particularly for languages that suffer from a lack of human-labeled resources. We present an extensive literature survey on the use of typological information in the development of NLP techniques. Our survey demonstrates that, to date, the use of information in existing typological databases has resulted in consistent but modest improvements in system performance. We show that this is due to both intrinsic limitations of the databases (in terms of coverage and feature granularity) and under-employment of the typological features included in them. We advocate for a new approach that adapts the broad and discrete nature of typological categories to the contextual and continuous nature of machine learning algorithms used in contemporary NLP. In particular, we suggest that such an approach could be facilitated by recent developments in data-driven induction of typological knowledge.
Similar Papers
In the same crypt: Computation & Language
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Effective Approaches to Attention-based Neural Machine Translation
A large annotated corpus for learning natural language inference