StrassenNets: Deep Learning with a Multiplication Budget

December 11, 2017 · Entered Twilight · 🏛 International Conference on Machine Learning

🌅 TWILIGHT: Old Age
Predates the code-sharing era, a pioneer of its time

"Last commit was 7.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: PosterStrassenNets.pdf, README.md, StrassenNetsCIFAR10.ipynb, StrassenNetworksCIFAR10_VGG.py, common, langmod

Authors: Michael Tschannen, Aran Khanna, Anima Anandkumar
arXiv ID: 1712.03942
Category: cs.LG (Machine Learning)
Cross-listed: cs.CV
Citations: 30
Venue: International Conference on Machine Learning
Repository: https://github.com/mitscha/strassennets (⭐ 47)
Last Checked: 1 month ago

Abstract
A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers. We perform end-to-end learning of low-cost approximations of matrix multiplications in DNN layers by casting matrix multiplications as 2-layer sum-product networks (SPNs) (arithmetic circuits) and learning their (ternary) edge weights from data. The SPNs disentangle multiplication and addition operations and enable us to impose a budget on the number of multiplication operations. Combining our method with knowledge distillation and applying it to image classification DNNs (trained on ImageNet) and language modeling DNNs (using LSTMs), we obtain a first-of-a-kind reduction in number of multiplications (over 99.5%) while maintaining the predictive performance of the full-precision models. Finally, we demonstrate that the proposed framework is able to rediscover Strassen's matrix multiplication algorithm, learning to multiply $2 \times 2$ matrices using only 7 multiplications instead of 8.
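
The 2-layer sum-product view described in the abstract can be illustrated with the classical Strassen algorithm for $2 \times 2$ matrices. The sketch below is not taken from the paper's repository: the names `W_A`, `W_B`, `W_C`, `matvec`, and `strassen_spn` are hypothetical, and the ternary weights are written out by hand from the textbook Strassen formulas rather than learned from data. It computes $\mathrm{vec}(C) = W_C\,[(W_A\,\mathrm{vec}(A)) \odot (W_B\,\mathrm{vec}(B))]$, where the element-wise product has exactly 7 entries.

```python
# Hand-written ternary weights ({-1, 0, 1}) encoding Strassen's algorithm.
# Assumption: matrices are flattened row-major as [x11, x12, x21, x22].

# Each row selects the linear combination of A entries for one product M1..M7.
W_A = [
    [1, 0, 0, 1],    # M1: a11 + a22
    [0, 0, 1, 1],    # M2: a21 + a22
    [1, 0, 0, 0],    # M3: a11
    [0, 0, 0, 1],    # M4: a22
    [1, 1, 0, 0],    # M5: a11 + a12
    [-1, 0, 1, 0],   # M6: a21 - a11
    [0, 1, 0, -1],   # M7: a12 - a22
]

# Each row selects the linear combination of B entries for the same product.
W_B = [
    [1, 0, 0, 1],    # M1: b11 + b22
    [1, 0, 0, 0],    # M2: b11
    [0, 1, 0, -1],   # M3: b12 - b22
    [-1, 0, 1, 0],   # M4: b21 - b11
    [0, 0, 0, 1],    # M5: b22
    [1, 1, 0, 0],    # M6: b11 + b12
    [0, 0, 1, 1],    # M7: b21 + b22
]

# Each row combines the 7 products into one entry of C.
W_C = [
    [1, 0, 0, 1, -1, 0, 1],   # c11 = M1 + M4 - M5 + M7
    [0, 0, 1, 0, 1, 0, 0],    # c12 = M3 + M5
    [0, 1, 0, 1, 0, 0, 0],    # c21 = M2 + M4
    [1, -1, 1, 0, 0, 1, 0],   # c22 = M1 - M2 + M3 + M6
]


def matvec(W, x):
    """Apply a ternary matrix to a vector.

    Written with `*` for brevity; because every weight is -1, 0, or 1,
    this step needs only additions and subtractions in principle.
    """
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def strassen_spn(A, B):
    """Multiply two 2x2 matrices using 7 scalar multiplications."""
    a = [A[0][0], A[0][1], A[1][0], A[1][1]]
    b = [B[0][0], B[0][1], B[1][0], B[1][1]]
    left = matvec(W_A, a)     # 7 sums of A entries
    right = matvec(W_B, b)    # 7 sums of B entries
    products = [l * r for l, r in zip(left, right)]  # the only true multiplications
    c = matvec(W_C, products)
    return [[c[0], c[1]], [c[2], c[3]]]


result = strassen_spn([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)
```

The number of rows in `W_A` and `W_B` (equivalently, columns in `W_C`) is the multiplication budget: 7 here, versus 8 for the naive $2 \times 2$ product. In the setting the abstract describes, these ternary matrices would be learned end-to-end instead of fixed by hand.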