🌅
Old Age
Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning
August 16, 2019 · 🏛 Conference on Empirical Methods in Natural Language Processing
"No code URL or promise found in abstract"
"HuggingFace models found (backfill)"
Evidence collected by the PWNC Scanner
Authors
Pradeep Dasigi, Nelson F. Liu, Ana Marasović, Noah A. Smith, Matt Gardner
arXiv ID
1908.05803
Category
cs.CL: Computation & Language
Citations
186
Venue
Conference on Empirical Methods in Natural Language Processing
Repository
https://huggingface.co/datasets/allenai/quoref
Abstract
Machine comprehension of texts longer than a single sentence often requires coreference resolution. However, most current reading comprehension benchmarks do not contain complex coreferential phenomena and hence fail to evaluate the ability of models to resolve coreference. We present a new crowdsourced dataset containing more than 24K span-selection questions that require resolving coreference among entities in over 4.7K English paragraphs from Wikipedia. Obtaining questions focused on such phenomena is challenging, because it is hard to avoid lexical cues that shortcut complex reasoning. We deal with this issue by using a strong baseline model as an adversary in the crowdsourcing loop, which helps crowdworkers avoid writing questions with exploitable surface cues. We show that state-of-the-art reading comprehension models perform significantly worse than humans on this benchmark: the best model performance is 70.5 F1, while the estimated human performance is 93.4 F1.
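The F1 figures quoted in the abstract are span-overlap scores. As a rough illustration only, the sketch below assumes a SQuAD-style token-level F1 between one predicted span and one gold span; the benchmark's official evaluation script may normalise text or handle multiple answer spans differently. The function names `normalize` and `token_f1` and the example strings are hypothetical and do not come from the paper.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted span and a gold span (assumed metric)."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical prediction/gold pairs, not taken from the dataset.
    print(token_f1("the Eiffel Tower", "Eiffel Tower"))
    print(token_f1("John Smith", "Smith"))
```

A dataset-level score under this assumption would be the mean of `token_f1` over all questions, multiplied by 100 to match the scale used in the abstract.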
📜 Similar Papers
In the same crypt — Computation & Language
🌅
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
🌅
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
🔮
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
🌅
Old Age
A large annotated corpus for learning natural language inference