Approximate Summaries for Why and Why-not Provenance (Extended Version)

January 31, 2020 ยท Declared Dead ยท ๐Ÿ› Proceedings of the VLDB Endowment

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Seokki Lee, Bertram Ludaescher, Boris Glavic arXiv ID 2002.00084 Category cs.DB: Databases Citations 29 Venue Proceedings of the VLDB Endowment Last Checked 3 months ago
Abstract
Why and why-not provenance have been studied extensively in recent years. However, why-not provenance, and to a lesser degree why provenance, can be very large resulting in severe scalability and usability challenges. In this paper, we introduce a novel approximate summarization technique for provenance which overcomes these challenges. Our approach uses patterns to encode (why-not) provenance concisely. We develop techniques for efficiently computing provenance summaries balancing informativeness, conciseness, and completeness. To achieve scalability, we integrate sampling techniques into provenance capture and summarization. Our approach is the first to scale to large datasets and to generate comprehensive and meaningful summaries.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Databases

R.I.P. ๐Ÿ‘ป Ghosted

Datasheets for Datasets

Timnit Gebru, Jamie Morgenstern, ... (+5 more)

cs.DB ๐Ÿ› CACM ๐Ÿ“š 2.6K cites 8 years ago

Died the same way โ€” ๐Ÿ‘ป Ghosted