Learning what matters - Sampling interesting patterns

February 07, 2017 · Declared Dead · 🏛 Pacific-Asia Conference on Knowledge Discovery and Data Mining

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Vladimir Dzyuba, Matthijs van Leeuwen
arXiv ID: 1702.01975
Category: stat.ML: Machine Learning (Stat)
Cross-listed: cs.AI, cs.DB
Citations: 15
Venue: Pacific-Asia Conference on Knowledge Discovery and Data Mining
Last Checked: 3 months ago

Abstract

In the field of exploratory data mining, local structure in data can be described by patterns and discovered by mining algorithms. Although many solutions have been proposed to address the redundancy problems in pattern mining, most of them either provide succinct pattern sets or take the interests of the user into account, but not both. Consequently, the analyst has to invest substantial effort in identifying those patterns that are relevant to her specific interests and goals. To address this problem, we propose a novel approach that combines pattern sampling with interactive data mining. In particular, we introduce the LetSIP algorithm, which builds upon recent advances in 1) weighted sampling in SAT and 2) learning to rank in interactive pattern mining. Specifically, it exploits user feedback to directly learn the parameters of the sampling distribution that represents the user's interests. We compare the performance of the proposed algorithm to the state of the art in interactive pattern mining by emulating the interests of a user. The resulting system allows efficient and interleaved learning and sampling, and thus user-specific, anytime data exploration. Finally, LetSIP demonstrates favourable trade-offs concerning both quality-diversity and exploitation-exploration when compared to existing methods.
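
The interleaved loop the abstract describes (sample patterns, collect a ranking from the user, update the sampling distribution) can be illustrated with a toy sketch. This is not the LetSIP implementation: it assumes patterns are already enumerated as small feature vectors, replaces SAT-based weighted sampling with direct weighted sampling over that list, and uses a perceptron-style pairwise update as a stand-in for learning to rank. All function names and parameters (`sampling_weight`, `update_from_ranking`, `emulate_user`, `explore`, `lr`, `k`) are hypothetical.

```python
import math
import random


def score(weights, features):
    """Linear score of a pattern under the current model."""
    return sum(w * f for w, f in zip(weights, features))


def sampling_weight(weights, features):
    """Assumed form: a logistic squashing keeps sampling weights positive and bounded."""
    return 1.0 / (1.0 + math.exp(-score(weights, features)))


def sample_patterns(patterns, weights, k, rng):
    """Draw k pattern indices (with replacement) proportionally to their sampling weight."""
    probs = [sampling_weight(weights, f) for f in patterns]
    return rng.choices(range(len(patterns)), weights=probs, k=k)


def update_from_ranking(weights, patterns, ranked_ids, lr=0.1):
    """Pairwise update: for each (better, worse) pair the model does not
    order correctly, move the weights towards the feature difference."""
    new = list(weights)
    for i, better in enumerate(ranked_ids):
        for worse in ranked_ids[i + 1:]:
            diff = [b - w for b, w in zip(patterns[better], patterns[worse])]
            if score(new, diff) <= 0:
                new = [w + lr * d for w, d in zip(new, diff)]
    return new


def emulate_user(patterns, ids, hidden):
    """Emulated user: ranks the distinct sampled patterns by a hidden linear preference."""
    return sorted(set(ids), key=lambda i: score(hidden, patterns[i]), reverse=True)


def explore(patterns, hidden, rounds=30, k=5, seed=0):
    """Interleave sampling and learning for a fixed number of rounds."""
    rng = random.Random(seed)
    weights = [0.0] * len(patterns[0])
    for _ in range(rounds):
        ids = sample_patterns(patterns, weights, k, rng)
        ranking = emulate_user(patterns, ids, hidden)
        weights = update_from_ranking(weights, patterns, ranking)
    return weights


if __name__ == "__main__":
    # Toy pattern space: 25 patterns described by two made-up features in [0, 1].
    toy_patterns = [[a / 4, b / 4] for a in range(5) for b in range(5)]
    hidden_preference = [1.0, -0.3]
    print(explore(toy_patterns, hidden_preference))
```

Because the weights are usable after every round, the loop can be stopped at any point, which is the sense in which such a scheme supports anytime exploration. Sampling in proportion to the learned weight, rather than always returning the top-scoring patterns, is what leaves room for both exploration and diversity in this sketch.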