Regression as Classification: Influence of Task Formulation on Neural Network Features

November 10, 2022 · Declared Dead · 🏛 International Conference on Artificial Intelligence and Statistics

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Lawrence Stewart, Francis Bach, Quentin Berthet, Jean-Philippe Vert
arXiv ID: 2211.05641
Category: cs.LG (Machine Learning)
Cross-listed: cs.AI, stat.ML
Citations: 35
Venue: International Conference on Artificial Intelligence and Statistics
Last Checked: 3 months ago

Abstract

Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss. However, practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross entropy loss results in better performance. By focusing on two-layer ReLU networks, which can be fully characterized by measures over their feature space, we explore how the implicit bias induced by gradient-based optimization could partly explain the above phenomenon. We provide theoretical evidence that the regression formulation yields a measure whose support can differ greatly from that for classification, in the case of one-dimensional data. Our proposed optimal supports correspond directly to the features learned by the input layer of the network. The different nature of these supports sheds light on possible optimization difficulties the square loss could encounter during training, and we present empirical results illustrating this phenomenon.
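The two task formulations contrasted in the abstract can be illustrated with a minimal NumPy sketch: the same two-layer ReLU network is scored either with the square loss on a scalar output, or with the cross-entropy loss after the real-valued targets are discretized into classes. The uniform binning, the expected-value decoding, and all function names are assumptions made here for illustration; the abstract does not specify how the authors discretize targets or map class predictions back to real values.

```python
import numpy as np


def make_bins(y_min, y_max, num_bins):
    """Uniform discretization of [y_min, y_max] (an illustrative assumption)."""
    edges = np.linspace(y_min, y_max, num_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return edges, centers


def targets_to_classes(y, edges):
    """Map each real-valued target to the index of the bin containing it."""
    # Using only the interior edges yields indices in [0, num_bins - 1].
    return np.digitize(y, edges[1:-1])


def two_layer_relu(x, W1, b1, W2, b2):
    """Two-layer ReLU network: relu(x W1 + b1) W2 + b2."""
    hidden = np.maximum(x @ W1 + b1, 0.0)
    return hidden @ W2 + b2


def square_loss(pred, y):
    """Regression formulation: the network has a single scalar output."""
    return 0.5 * np.mean((pred[:, 0] - y) ** 2)


def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def cross_entropy_loss(logits, classes):
    """Classification formulation: one logit per bin."""
    log_probs = log_softmax(logits)
    return -np.mean(log_probs[np.arange(len(classes)), classes])


def decode_expected_value(logits, centers):
    """Turn class probabilities back into a real-valued prediction."""
    probs = np.exp(log_softmax(logits))
    return probs @ centers


# One-dimensional toy data, matching the setting named in the abstract.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(16, 1))
y = np.sin(3.0 * x[:, 0])

hidden_units, num_bins = 32, 10
edges, centers = make_bins(-1.0, 1.0, num_bins)
classes = targets_to_classes(y, edges)

# The first layer is shared in shape; only the output layer differs.
W1 = rng.normal(size=(1, hidden_units))
b1 = rng.normal(size=hidden_units)
W2_reg = rng.normal(size=(hidden_units, 1)) / hidden_units
W2_cls = rng.normal(size=(hidden_units, num_bins)) / hidden_units

reg_loss = square_loss(two_layer_relu(x, W1, b1, W2_reg, 0.0), y)
cls_logits = two_layer_relu(x, W1, b1, W2_cls, 0.0)
cls_loss = cross_entropy_loss(cls_logits, classes)
cls_pred = decode_expected_value(cls_logits, centers)
```

In both cases the input-layer weights `W1` and `b1` define the features the abstract refers to; only the output layer and the loss change between the two formulations. Training either variant would require gradients, which this sketch deliberately leaves out.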