EntroGD: Scalable Generalized Deduplication for Efficient Direct Analytics on Compressed IoT Data

November 06, 2025 · Declared Dead · 🏛 the IEEE INFOCOM 2026 Workshop on Fusion of Data

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Xiaobo Zhao, Daniel E. Lucani
arXiv ID: 2511.04148
Category: cs.DB: Databases
Citations: 0
Venue: the IEEE INFOCOM 2026 Workshop on Fusion of Data
Last Checked: 3 months ago
Abstract
Massive data streams from IoT and cyber-physical systems must be processed under strict bandwidth, latency, and resource constraints. Generalized Deduplication (GD) is a promising lossless compression framework, as it supports random access and direct analytics on compressed data. However, existing GD algorithms exhibit quadratic complexity $\mathcal{O}(nd^{2})$, which limits their scalability for high-dimensional datasets. This paper proposes EntroGD, an entropy-guided GD framework that decouples analytical fidelity from compression efficiency to achieve linear complexity $\mathcal{O}(nd)$. EntroGD adopts a two-stage design, first constructing compact condensed samples to preserve information critical for analytics, and then applying entropy-based bit selection to maximize compression. Experiments on 18 IoT datasets show that EntroGD reduces configuration time by up to $53.5\times$ compared to state-of-the-art GD compressors. Moreover, by enabling analytics with access to only $2.6\%$ of the original data volume, EntroGD accelerates clustering by up to $31.6\times$ with negligible loss in accuracy. Overall, EntroGD provides a scalable and system-efficient solution for direct analytics on compressed IoT data.
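
For readers unfamiliar with generalized deduplication, the general idea can be sketched as follows. This is a minimal illustration only, not the authors' EntroGD algorithm (no code is available for the paper): it assumes fixed-width non-negative integer samples, and the function names and the `threshold` rule are hypothetical. Each sample is split into a base (the selected bits) and a deviation (the remaining bits); bases are stored once, and bits with low per-position entropy are placed in the base because they tend to repeat across samples.

```python
import math


def bit_entropies(samples, width):
    """Shannon entropy (in bits) of each bit position across all samples."""
    n = len(samples)
    entropies = []
    for pos in range(width):
        ones = sum((s >> pos) & 1 for s in samples)
        p = ones / n
        if p in (0.0, 1.0):
            entropies.append(0.0)  # constant bit carries no information
        else:
            entropies.append(-(p * math.log2(p) + (1 - p) * math.log2(1 - p)))
    return entropies


def select_base_mask(samples, width, threshold=0.5):
    """Hypothetical rule: bits with entropy below `threshold` go to the base."""
    mask = 0
    for pos, h in enumerate(bit_entropies(samples, width)):
        if h < threshold:
            mask |= 1 << pos
    return mask


def gd_compress(samples, base_mask):
    """Split each sample into (base, deviation) and deduplicate the bases."""
    base_ids = {}   # base value -> index into `bases`
    bases = []      # each distinct base is stored once
    encoded = []    # one (base_id, deviation) pair per sample
    for s in samples:
        base = s & base_mask
        deviation = s & ~base_mask
        if base not in base_ids:
            base_ids[base] = len(bases)
            bases.append(base)
        encoded.append((base_ids[base], deviation))
    return bases, encoded


def gd_decompress(bases, encoded):
    """Lossless reconstruction: OR each base back with its deviation."""
    return [bases[i] | dev for i, dev in encoded]
```

Because every sample keeps its own `(base_id, deviation)` pair, a single sample can be decoded without touching the others, which is the random-access property the abstract refers to. Computing one entropy per bit position costs a single pass over the data, which is linear in the number of samples and bit positions.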

📜 Similar Papers

In the same crypt: Databases

R.I.P. 👻 Ghosted

Datasheets for Datasets

Timnit Gebru, Jamie Morgenstern, ... (+5 more)

cs.DB ๐Ÿ› CACM ๐Ÿ“š 2.6K cites 8 years ago

Died the same way: 👻 Ghosted