Duet: efficient and scalable hybriD neUral rElation undersTanding
July 25, 2023 Β· Declared Dead Β· π IEEE International Conference on Data Engineering
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Kaixin Zhang, Hongzhi Wang, Yabin Lu, Ziqi Li, Chang Shu, Yu Yan, Donghua Yang
arXiv ID
2307.13494
Category
cs.DB: Databases
Cross-listed
cs.AI,
cs.LG
Citations
4
Venue
IEEE International Conference on Data Engineering
Last Checked
3 months ago
Abstract
Learned cardinality estimation methods have achieved high precision compared to traditional methods. Among learned methods, query-driven approaches have faced the workload drift problem for a long time. Although both data-driven and hybrid methods are proposed to avoid this problem, most of them suffer from high training and estimation costs, limited scalability, instability, and long-tail distribution problems on high-dimensional tables, which seriously affects the practical application of learned cardinality estimators. In this paper, we prove that most of these problems are directly caused by the widely used progressive sampling. We solve this problem by introducing predicate information into the autoregressive model and propose Duet, a stable, efficient, and scalable hybrid method to estimate cardinality directly without sampling or any non-differentiable process, which can not only reduce the inference complexity from $O(n)$ to $O(1)$ compared to Naru and UAE but also achieve higher accuracy on high cardinality and high-dimensional tables. Experimental results show that Duet can achieve all the design goals above and be much more practical. Besides, Duet even has a lower inference cost on CPU than that of most learned methods on GPU.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Databases
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Untangling Blockchain: A Data Processing View of Blockchain Systems
R.I.P.
π»
Ghosted
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades
R.I.P.
π»
Ghosted
BLOCKBENCH: A Framework for Analyzing Private Blockchains
R.I.P.
π»
Ghosted
Data Synthesis based on Generative Adversarial Networks
R.I.P.
π»
Ghosted
HoloClean: Holistic Data Repairs with Probabilistic Inference
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted