Fathom: Reference Workloads for Modern Deep Learning Methods

August 23, 2016 · Declared Dead · 🏛 IEEE International Symposium on Workload Characterization

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Robert Adolf, Saketh Rama, Brandon Reagen, Gu-Yeon Wei, David Brooks
arXiv ID: 1608.06581
Category: cs.LG (Machine Learning)
Citations: 186
Venue: IEEE International Symposium on Workload Characterization
Last Checked: 4 months ago

Abstract

Deep learning has been popularized by its recent successes on challenging artificial intelligence problems. One of the reasons for its dominance is also an ongoing challenge: the need for immense amounts of computational power. Hardware architects have responded by proposing a wide array of promising ideas, but to date, the majority of the work has focused on specific algorithms in somewhat narrow application domains. While their specificity does not diminish these approaches, there is a clear need for more flexible solutions. We believe the first step is to examine the characteristics of cutting-edge models from across the deep learning community. Consequently, we have assembled Fathom: a collection of eight archetypal deep learning workloads for study. Each of these models comes from a seminal work in the deep learning community, ranging from the familiar deep convolutional neural network of Krizhevsky et al., to the more exotic memory networks from Facebook's AI research group. Fathom has been released online, and this paper focuses on understanding the fundamental performance characteristics of each model. We use a set of application-level modeling tools built around the TensorFlow deep learning framework in order to analyze the behavior of the Fathom workloads. We present a breakdown of where time is spent, the similarities between the performance profiles of our models, an analysis of behavior in inference and training, and the effects of parallelism on scaling.