On the Reproducibility of Experiments of Indexing Repetitive Document Collections
December 26, 2019 · Entered Twilight · Joint Conference of the Information Retrieval Communities in Europe
"Last commit was 5.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: Dockerfile, LICENSE.md, README.md, docker, uiHRDC
Authors
Antonio Fariña, Miguel A. Martínez-Prieto, Francisco Claude, Gonzalo Navarro, Juan J. Lastra-Díaz, Nicola Prezza, Diego Seco
arXiv ID
1912.11944
Category
cs.DS: Data Structures & Algorithms
Cross-listed
cs.PF
Citations
3
Venue
Joint Conference of the Information Retrieval Communities in Europe
Repository
https://github.com/migumar2/uiHRDC/
★ 24
Last Checked
2 months ago
Abstract
This work introduces a companion reproducible paper that aims to allow the exact replication of the methods, experiments, and results discussed in a previous work [5]. In that parent paper, we proposed a variety of techniques for compressing indexes, which exploit the fact that highly repetitive collections consist mostly of documents that are near-copies of others. More concretely, we describe a replication framework, called uiHRDC (universal indexes for Highly Repetitive Document Collections), that allows our original experimental setup to be easily replicated using various document collections. The corresponding experimentation is carefully explained, providing precise details about the parameters that can be tuned for each indexing solution. Finally, note that we also provide uiHRDC as a reproducibility package.
Similar Papers
In the same crypt: Data Structures & Algorithms
Relief-Based Feature Selection: Introduction and Review
Route Planning in Transportation Networks
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
Hierarchical Clustering: Objective Functions and Algorithms