R.I.P.
๐ป
Ghosted
Sorbet: A Neuromorphic Hardware-Compatible Transformer-Based Spiking Language Model
September 04, 2024 ยท Declared Dead ยท ๐ International Conference on Machine Learning
Authors
Kaiwen Tang, Zhanglu Yan, Weng-Fai Wong
arXiv ID
2409.15298
Category
cs.NE: Neural & Evolutionary
Cross-listed
cs.CL,
cs.LG
Citations
8
Venue
International Conference on Machine Learning
Repository
https://github.com/Kaiwen-Tang/Sorbet}{https://github.com/Kaiwen-Tang/Sorbet}
Last Checked
1 month ago
Abstract
For reasons such as privacy, there are use cases for language models at the edge. This has given rise to small language models targeted for deployment in resource-constrained devices where energy efficiency is critical. Spiking neural networks (SNNs) offer a promising solution due to their energy efficiency, and there are already works on realizing transformer-based models on SNNs. However, key operations like softmax and layer normalization (LN) are difficult to implement on neuromorphic hardware, and many of these early works sidestepped them. To address these challenges, we introduce Sorbet, a transformer-based spiking language model that is more neuromorphic hardware-compatible. Sorbet incorporates a novel shifting-based softmax called PTsoftmax and a Bit Shifting PowerNorm (BSPN), both designed to replace the respective energy-intensive operations. By leveraging knowledge distillation and model quantization, Sorbet achieved a highly compressed binary weight model that maintains competitive performance while achieving $27.16\times$ energy savings compared to BERT. We validate Sorbet through extensive testing on the GLUE benchmark and a series of ablation studies, demonstrating its potential as an energy-efficient solution for language model inference. Our code is publicly available at \href{https://github.com/Kaiwen-Tang/Sorbet}{https://github.com/Kaiwen-Tang/Sorbet}
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Neural & Evolutionary
R.I.P.
๐ป
Ghosted
Progressive Growing of GANs for Improved Quality, Stability, and Variation
R.I.P.
๐ป
Ghosted
Learning both Weights and Connections for Efficient Neural Networks
R.I.P.
๐ป
Ghosted
LSTM: A Search Space Odyssey
R.I.P.
๐ป
Ghosted
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
R.I.P.
๐ป
Ghosted
An Introduction to Convolutional Neural Networks
Died the same way โ ๐ 404 Not Found
R.I.P.
๐
404 Not Found
Deep High-Resolution Representation Learning for Visual Recognition
R.I.P.
๐
404 Not Found
HuggingFace's Transformers: State-of-the-art Natural Language Processing
R.I.P.
๐
404 Not Found
CCNet: Criss-Cross Attention for Semantic Segmentation
R.I.P.
๐
404 Not Found