Stochastic Optimization of Sorting Networks via Continuous Relaxations

March 21, 2019 · Declared Dead · 🏛 International Conference on Learning Representations

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Aditya Grover, Eric Wang, Aaron Zweig, Stefano Ermon
arXiv ID: 1903.08850
Category: stat.ML: Machine Learning (Stat)
Cross-listed: cs.LG, cs.NE
Citations: 200
Venue: International Conference on Learning Representations
Last Checked: 1 month ago

Abstract

Sorting input objects is an important step in many machine learning pipelines. However, the sorting operator is non-differentiable with respect to its inputs, which prohibits end-to-end gradient-based optimization. In this work, we propose NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, where every row sums to one and has a distinct arg max. This relaxation permits straight-through optimization of any computational graph that involves a sorting operation. Further, we use this relaxation to enable gradient-based stochastic optimization over the combinatorially large space of permutations by deriving a reparameterized gradient estimator for the Plackett-Luce family of distributions over permutations. We demonstrate the usefulness of our framework on three tasks that require learning semantic orderings of high-dimensional objects, including a fully differentiable, parameterized extension of the k-nearest neighbors algorithm.
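To make the relaxation concrete, here is a minimal NumPy sketch. The abstract does not give the formula, so the row-wise softmax used below, softmax(((n + 1 - 2i) s - A 1) / tau) with A[j, k] = |s_j - s_k|, is recalled from the body of the paper and should be treated as an assumption rather than a verified transcription. The function names `neuralsort` and `sample_relaxed_permutation`, the temperature parameter `tau`, and the example scores are illustrative choices, not the authors' code. The second function sketches the reparameterized Plackett-Luce sampling idea through the standard Gumbel perturbation of log-scores.

```python
import numpy as np

def neuralsort(s, tau=1.0):
    """Relaxed descending-sort matrix for a 1-D score vector s (illustrative sketch)."""
    s = np.asarray(s, dtype=float)
    n = s.shape[0]
    # Sum of pairwise absolute differences for each element: (A 1)_j = sum_k |s_j - s_k|.
    abs_diff_sums = np.abs(s[:, None] - s[None, :]).sum(axis=1)
    # Row coefficient (n + 1 - 2i) for i = 1..n.
    coeff = n + 1 - 2 * np.arange(1, n + 1)
    logits = (coeff[:, None] * s[None, :] - abs_diff_sums[None, :]) / tau
    # Subtract the row maximum before exponentiating for numerical stability.
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def sample_relaxed_permutation(log_scores, tau=1.0, rng=None):
    """Relaxed Plackett-Luce sample: perturb log-scores with Gumbel noise, then relax the sort."""
    rng = np.random.default_rng() if rng is None else rng
    log_scores = np.asarray(log_scores, dtype=float)
    gumbel = rng.gumbel(size=log_scores.shape)
    return neuralsort(log_scores + gumbel, tau)

scores = np.array([2.0, 5.0, 3.0])
P = neuralsort(scores, tau=1.0)
print(P.sum(axis=1))       # every row sums to one
print(P.argmax(axis=1))    # [1 2 0]: row i points at the i-th largest score
print(neuralsort(scores, tau=0.01) @ scores)  # approaches the descending sort as tau shrinks
```

Each row of `P` is a softmax, so the matrix is row-stochastic by construction, and for distinct scores the row-wise arg max recovers the sorting permutation. Lowering `tau` pushes `P` toward a hard permutation matrix at the cost of sharper, less informative gradients.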