Optimal Hashing in External Memory
May 23, 2018 Β· Declared Dead Β· π International Colloquium on Automata, Languages and Programming
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Alex Conway, Martin Farach-Colton, Philip Shilane
arXiv ID
1805.09423
Category
cs.DS: Data Structures & Algorithms
Citations
22
Venue
International Colloquium on Automata, Languages and Programming
Last Checked
3 months ago
Abstract
Hash tables are a ubiquitous class of dictionary data structures. However, standard hash table implementations do not translate well into the external memory model, because they do not incorporate locality for insertions. Iacono and Patracsu established an update/query tradeoff curve for external hash tables: a hash table that performs insertions in $O(Ξ»/B)$ amortized IOs requires $Ξ©(\log_Ξ»N)$ expected IOs for queries, where $N$ is the number of items that can be stored in the data structure, $B$ is the size of a memory transfer, $M$ is the size of memory, and $Ξ»$ is a tuning parameter. They provide a hashing data structure that meets this curve for $Ξ»$ that is $Ξ©(\log\log M + \log_M N)$. Their data structure, which we call an \defn{IP hash table}, is complicated and, to the best of our knowledge, has not been implemented. In this paper, we present a new and much simpler optimal external memory hash table, the \defn{Bundle of Arrays Hash Table} (BOA). BOAs are based on size-tiered LSMs, a well-studied data structure, and are almost as easy to implement. The BOA is optimal for a narrower range of $Ξ»$. However, the simplicity of BOAs allows them to be readily modified to achieve the following results: \begin{itemize} \item A new external memory data structure, the \defn{Bundle of Trees Hash Table} (BOT), that matches the performance of the IP hash table, while retaining some of the simplicity of the BOAs. \item The \defn{cache-oblivious Bundle of Trees Hash Table} (COBOT), the first cache-oblivious hash table. This data structure matches the optimality of BOTs and IP hash tables over the same range of $Ξ»$. \end{itemize}
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted