Boosted Dynamic Neural Networks
November 30, 2022 · Entered Twilight · AAAI Conference on Artificial Intelligence
Repo contents: LICENSE, LICENSE_MSDNet_PyTorch, README.md, adaptive_inference.py, args.py, dataloader.py, eval_cifar100.py, eval_imagenet.py, figures, models, msdnet_scripts, op_counter.py, ranet_scripts, requirements.txt, train_cifar100.py, train_imagenet.py, utils
Authors
Haichao Yu, Haoxiang Li, Gang Hua, Gao Huang, Humphrey Shi
arXiv ID
2211.16726
Category
cs.LG: Machine Learning
Cross-listed
cs.CV
Citations
15
Venue
AAAI Conference on Artificial Intelligence
Repository
https://github.com/SHI-Labs/Boosted-Dynamic-Networks
⭐ 8
Last Checked
1 month ago
Abstract
Early-exiting dynamic neural networks (EDNNs), one family of dynamic neural networks, have been widely studied recently. A typical EDNN attaches multiple prediction heads at different depths of the network backbone. During inference, the model exits at either the last prediction head or the first intermediate head whose prediction confidence exceeds a predefined threshold. To optimize the model, these prediction heads and the network backbone are trained jointly on every batch of training data. This introduces a train-test mismatch: all prediction heads are optimized on all types of data during training, but the deeper heads only see difficult inputs at test time. Treating inputs differently in the two phases causes a mismatch between the training and testing data distributions. To mitigate this problem, we formulate an EDNN as an additive model inspired by gradient boosting and propose multiple training techniques to optimize the model effectively. We name our method BoostNet. Our experiments show that it achieves state-of-the-art performance on the CIFAR100 and ImageNet datasets in both anytime and budgeted-batch prediction modes. Our code is released at https://github.com/SHI-Labs/Boosted-Dynamic-Networks.
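The early-exit inference rule described in the abstract can be sketched in a few lines. This is a hypothetical illustration, not the paper's released implementation: `early_exit_predict`, `heads`, and the toy heads below are illustrative names, and each head is assumed to be a callable returning class probabilities for an input.

```python
def early_exit_predict(heads, x, threshold):
    """Run prediction heads from shallow to deep; exit at the first head
    whose top-class confidence reaches `threshold`, otherwise fall through
    to the last head."""
    for i, head in enumerate(heads):
        probs = head(x)
        if max(probs) >= threshold or i == len(heads) - 1:
            # Return which head we exited at and its predicted class.
            return i, probs.index(max(probs))

# Toy heads: a shallow, uncertain head and a deeper, confident one.
shallow = lambda x: [0.40, 0.35, 0.25]  # below threshold -> keep going
deep    = lambda x: [0.05, 0.90, 0.05]  # confident -> exit here

exit_idx, label = early_exit_predict([shallow, deep], None, threshold=0.8)
# exit_idx == 1, label == 1: the shallow head was not confident enough,
# so the input fell through to the deeper head.
```

In the budgeted-batch setting described in the paper, the threshold is tuned so that a batch of inputs meets a given compute budget; easy inputs exit early and only hard ones reach the deep heads, which is exactly the source of the train-test mismatch the paper addresses.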
Similar Papers
In the same crypt — Machine Learning
- XGBoost: A Scalable Tree Boosting System — 👻 Ghosted
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift — 👻 Ghosted
- Semi-Supervised Classification with Graph Convolutional Networks — 👻 Ghosted
- Proximal Policy Optimization Algorithms — 👻 Ghosted