Discovering Descriptive Tile Trees by Mining Optimal Geometric Subtiles
February 07, 2019 Β· Declared Dead Β· π ECML/PKDD
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Nikolaj Tatti, Jilles Vreeken
arXiv ID
1902.02861
Category
cs.DS: Data Structures & Algorithms
Citations
13
Venue
ECML/PKDD
Last Checked
3 months ago
Abstract
When analysing binary data, the ease at which one can interpret results is very important. Many existing methods, however, discover either models that are difficult to read, or return so many results interpretation becomes impossible. Here, we study a fully automated approach for mining easily interpretable models for binary data. We model data hierarchically with noisy tiles-rectangles with significantly different density than their parent tile. To identify good trees, we employ the Minimum Description Length principle. We propose STIJL, a greedy any-time algorithm for mining good tile trees from binary data. Iteratively, it finds the locally optimal addition to the current tree, allowing overlap with tiles of the same parent. A major result of this paper is that we find the optimal tile in only $Ξ(NM\min(N, M))$ time. STIJL can either be employed as a top-$k$ miner, or by MDL we can identify the tree that describes the data best. Experiments show we find succinct models that accurately summarise the data, and, by their hierarchical property are easily interpretable.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted