A Bag of Tricks for Scaling CPU-based Deep FFMs to more than 300m Predictions per Second

July 14, 2024 · Declared Dead · 🏛 AdKDD@KDD

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Blaž Škrlj, Benjamin Ben-Shalom, Grega Gašperšič, Adi Schwartz, Ramzi Hoseisi, Naama Ziporin, Davorin Kopič, Andraž Tori
arXiv ID: 2407.10115
Category: cs.LG (Machine Learning)
Cross-listed: cs.AI, cs.IR
Citations: 1
Venue: AdKDD@KDD
Last Checked: 4 months ago

Abstract
Field-aware Factorization Machines (FFMs) have emerged as a powerful model for click-through rate prediction, particularly excelling in capturing complex feature interactions. In this work, we present an in-depth analysis of our in-house, Rust-based Deep FFM implementation, and detail its deployment on a CPU-only, multi-data-center scale. We overview key optimizations devised for both training and inference, demonstrated by previously unpublished benchmark results in efficient model search and online training. Further, we detail an in-house weight quantization that resulted in more than an order of magnitude reduction in bandwidth footprint related to weight transfers across data-centres. We disclose the engine and associated techniques under an open-source license to contribute to the broader machine learning community. This paper showcases one of the first successful CPU-only deployments of Deep FFMs at such scale, marking a significant stride in practical, low-footprint click-through rate prediction methodologies.