Multi-Resolution Hashing for Fast Pairwise Summations
July 19, 2018 · Declared Dead · IEEE Annual Symposium on Foundations of Computer Science
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Moses Charikar, Paris Siminelakis
arXiv ID
1807.07635
Category
cs.DS: Data Structures & Algorithms
Citations
12
Venue
IEEE Annual Symposium on Foundations of Computer Science
Last Checked
4 months ago
Abstract
A basic computational primitive in the analysis of massive datasets is summing simple functions over a large number of objects. Modern applications pose an additional challenge in that such functions often depend on a parameter vector $y$ (query) that is unknown a priori. Given a set of points $X\subset \mathbb{R}^{d}$ and a pairwise function $w:\mathbb{R}^{d}\times \mathbb{R}^{d}\to [0,1]$, we study the problem of designing a data-structure that enables sublinear-time approximation of the summation $Z_{w}(y)=\frac{1}{|X|}\sum_{x\in X}w(x,y)$ for any query $y\in \mathbb{R}^{d}$. By combining ideas from Harmonic Analysis (partitions of unity and approximation theory) with Hashing-Based-Estimators [Charikar, Siminelakis FOCS'17], we provide a general framework for designing such data structures through hashing that reaches far beyond what previous techniques allowed. A key design principle is a collection of $T\geq 1$ hashing schemes with collision probabilities $p_{1},\ldots, p_{T}$ such that $\sup_{t\in [T]}\{p_{t}(x,y)\} = \Theta(\sqrt{w(x,y)})$. This leads to a data-structure that approximates $Z_{w}(y)$ using a sub-linear number of samples from each hash family. Using this new framework along with Distance Sensitive Hashing [Aumuller, Christiani, Pagh, Silvestri PODS'18], we show that such a collection can be constructed and evaluated efficiently for any log-convex function $w(x,y)=e^{\phi(\langle x,y\rangle)}$ of the inner product on the unit sphere $x,y\in \mathcal{S}^{d-1}$. Our method leads to data structures with sub-linear query time that significantly improve upon random sampling and can be used for Kernel Density or Partition Function Estimation. We provide extensions of our result from the sphere to $\mathbb{R}^{d}$ and from scalar functions to vector functions.
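For orientation, the quantity $Z_{w}(y)$ and the uniform random-sampling baseline that the abstract compares against can be sketched in a few lines of Python. This is not the paper's hashing-based method; it only illustrates the exact sum and the naive estimator. The function names, the choice $\phi(r) = r - 1$ (which keeps $w$ in $(0,1]$ on the unit sphere), and the sizes used are all assumptions made for illustration.

```python
import math
import random

def exact_mean(X, y, w):
    """Exact Z_w(y) = (1/|X|) * sum over x in X of w(x, y); needs |X| evaluations of w."""
    return sum(w(x, y) for x in X) / len(X)

def sampled_mean(X, y, w, m, rng):
    """Unbiased estimate of Z_w(y) from m uniform samples drawn with replacement."""
    return sum(w(rng.choice(X), y) for _ in range(m)) / m

def unit_vector(d, rng):
    """Draw a uniformly random point on the unit sphere S^{d-1}."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def w_exp(x, y):
    """Hypothetical weight e^{phi(<x, y>)} with phi(r) = r - 1 (an assumption)."""
    return math.exp(sum(a * b for a, b in zip(x, y)) - 1.0)

rng = random.Random(0)                            # fixed seed for reproducibility
X = [unit_vector(8, rng) for _ in range(2000)]    # hypothetical dataset
y = unit_vector(8, rng)                           # hypothetical query
z_exact = exact_mean(X, y, w_exp)                 # 2000 evaluations of w
z_est = sampled_mean(X, y, w_exp, 500, rng)       # 500 evaluations of w
```

Uniform sampling needs on the order of $1/Z_{w}(y)$ samples for a constant relative error, so it degrades when the sum is dominated by a few large terms; that is the regime where the abstract claims an improvement.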
Similar Papers
In the same crypt: Data Structures & Algorithms
The Cartographer
R.I.P.
Ghosted
Route Planning in Transportation Networks
R.I.P.
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
Ghosted
Graph Isomorphism in Quasipolynomial Time
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted