Crowdsourced correlation clustering with relative distance comparisons
September 25, 2017 Β· Declared Dead Β· π Industrial Conference on Data Mining
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Antti Ukkonen
arXiv ID
1709.08459
Category
cs.DS: Data Structures & Algorithms
Citations
23
Venue
Industrial Conference on Data Mining
Last Checked
3 months ago
Abstract
Crowdsourced, or human computation based clustering algorithms usually rely on relative distance comparisons, as these are easier to elicit from human workers than absolute distance information. A relative distance comparison is a statement of the form "item A is closer to item B than to item C". However, many existing clustering algorithms that use relative distances are rather complex. They are often based on a two-step approach, where the relative distances are first used to learn either a distance matrix, or an embedding of the items, and then some standard clustering method is applied in a second step. In this paper we argue that it should be possible to compute a clustering directly from relative distance comparisons. Our ideas are built upon existing work on correlation clustering, a well-known non-parametric approach to clustering. The technical contribution of this work is twofold. We first define a novel variant of correlation clustering that is based on relative distance comparisons, and hence suitable for human computation. We go on to show that our new problem is closely related to basic correlation clustering, and use this property to design an approximation algorithm for our problem. As a second contribution, we propose a more practical algorithm, which we empirically compare against existing methods from literature. Experiments with synthetic data suggest that our approach can outperform more complex methods. Also, our method efficiently finds good and intuitive clusterings from real relative distance comparison data.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted