End-to-End Entity Resolution for Big Data: A Survey
May 15, 2019 ยท The Cartographer ยท ๐ arXiv.org
"No code URL or promise found in abstract"
"Title-pattern auto-detect: End-to-End Entity Resolution for Big Data: A Survey"
Evidence collected by the PWNC Scanner
Authors
Vassilis Christophides, Vasilis Efthymiou, Themis Palpanas, George Papadakis, Kostas Stefanidis
arXiv ID
1905.06397
Category
cs.DB: Databases
Citations
64
Venue
arXiv.org
Last Checked
8 days ago
Abstract
One of the most important tasks for improving data quality and the reliability of data analytics results is Entity Resolution (ER). ER aims to identify different descriptions that refer to the same real-world entity, and remains a challenging problem. While previous works have studied specific aspects of ER (and mostly in traditional settings), in this survey, we provide for the first time an end-to-end view of modern ER workflows, and of the novel aspects of entity indexing and matching methods in order to cope with more than one of the Big Data characteristics simultaneously. We present the basic concepts, processing steps and execution strategies that have been proposed by different communities, i.e., database, semantic Web and machine learning, in order to cope with the loose structuredness, extreme diversity, high speed and large scale of entity descriptions used by real-world applications. Finally, we provide a synthetic discussion of the existing approaches, and conclude with a detailed presentation of open research directions.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Databases
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
Untangling Blockchain: A Data Processing View of Blockchain Systems
R.I.P.
๐ป
Ghosted
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades
R.I.P.
๐ป
Ghosted
BLOCKBENCH: A Framework for Analyzing Private Blockchains
R.I.P.
๐ป
Ghosted
Data Synthesis based on Generative Adversarial Networks
R.I.P.
๐ป
Ghosted