R.I.P.
👻
Ghosted
SparseByteNN: A Novel Mobile Inference Acceleration Framework Based on Fine-Grained Group Sparsity
October 30, 2023 · Declared Dead · 🏛 arXiv.org
Authors
Haitao Xu, Songwei Liu, Yuyang Xu, Shuai Wang, Jiashi Li, Chenqian Yan, Liangqiang Li, Lean Fu, Xin Pan, Fangmin Chen
arXiv ID
2310.19509
Category
cs.AI: Artificial Intelligence
Cross-listed
cs.CV
Citations
3
Venue
arXiv.org
Repository
https://github.com/lswzjuer/SparseByteNN
Last Checked
1 month ago
Abstract
To address the challenge of increasing network size, researchers have developed sparse models through network pruning. However, maintaining model accuracy while achieving significant speedups on general computing devices remains an open problem. In this paper, we present a novel mobile inference acceleration framework SparseByteNN, which leverages fine-grained kernel sparsity to achieve real-time execution as well as high accuracy. Our framework consists of two parts: (a) A fine-grained kernel sparsity schema with a sparsity granularity between structured pruning and unstructured pruning. It designs multiple sparse patterns for different operators. Combined with our proposed whole network rearrangement strategy, the schema achieves a high compression rate and high precision at the same time. (b) Inference engine co-optimized with the sparse pattern. The conventional wisdom is that this reduction in theoretical FLOPs does not translate into real-world efficiency gains. We aim to correct this misconception by introducing a family of efficient sparse kernels for ARM and WebAssembly. Equipped with our efficient implementation of sparse primitives, we show that sparse versions of MobileNet-v1 outperform strong dense baselines on the efficiency-accuracy curve. Experimental results on Qualcomm 855 show that for 30% sparse MobileNet-v1, SparseByteNN achieves 1.27x speedup over the dense version and 1.29x speedup over the state-of-the-art sparse inference engine MNN with a slight accuracy drop of 0.224%. The source code of SparseByteNN will be available at https://github.com/lswzjuer/SparseByteNN
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
📜 Similar Papers
In the same crypt — Artificial Intelligence
R.I.P.
👻
Ghosted
Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
R.I.P.
👻
Ghosted
Addressing Function Approximation Error in Actor-Critic Methods
R.I.P.
👻
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
👻
Ghosted
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
R.I.P.
👻
Ghosted
Complex Embeddings for Simple Link Prediction
Died the same way — ⚰️ The Empty Tomb
R.I.P.
⚰️
The Empty Tomb
DSFD: Dual Shot Face Detector
R.I.P.
⚰️
The Empty Tomb
InstanceCut: from Edges to Instances with MultiCut
R.I.P.
⚰️
The Empty Tomb
FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis
R.I.P.
⚰️
The Empty Tomb