A Benchmark of Rule-Based and Neural Coreference Resolution in Dutch Novels and News

November 03, 2020 · Entered Twilight · 🏛 CRAC

"Last commit was 5.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, .travis.yml, LICENSE, README.md, e2edutch, requirements.txt, scripts, setup.py, test

Authors: Corbèn Poot, Andreas van Cranenburgh
arXiv ID: 2011.01615
Category: cs.CL (Computation & Language)
Citations: 18
Venue: CRAC
Repository: https://github.com/andreasvc/crac2020 (⭐ 4)
Last Checked: 1 month ago

Abstract

We evaluate a rule-based (Lee et al., 2013) and a neural (Lee et al., 2018) coreference system on Dutch datasets from two domains: literary novels and news/Wikipedia text. The results provide insight into the relative strengths of data-driven and knowledge-driven systems, as well as the influence of domain, document length, and annotation schemes. The neural system performs best on news/Wikipedia text, while the rule-based system performs best on literature. The neural system shows weaknesses with limited training data and long documents, while the rule-based system is affected by annotation differences. The code and models used in this paper are available at https://github.com/andreasvc/crac2020