R.I.P.
๐ป
Ghosted
MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters
November 07, 2023 ยท Entered Twilight ยท ๐ IEEE Workshop/Winter Conference on Applications of Computer Vision
Repo contents: .vscode, README.md, compute_flops.py, custom_log.py, main.py, models, requirements.txt, scripts, utils.py
Authors
Chau Pham, Piotr Teterwak, Soren Nelson, Bryan A. Plummer
arXiv ID
2311.04251
Category
cs.LG: Machine Learning
Cross-listed
cs.AI,
cs.CV
Citations
6
Venue
IEEE Workshop/Winter Conference on Applications of Computer Vision
Repository
https://github.com/chaudatascience/mixturegrowth
โญ 9
Last Checked
1 month ago
Abstract
Most deep neural networks are trained under fixed network architectures and require retraining when the architecture changes. If expanding the network's size is needed, it is necessary to retrain from scratch, which is expensive. To avoid this, one can grow from a small network by adding random weights over time to gradually achieve the target network size. However, this naive approach falls short in practice as it brings too much noise to the growing process. Prior work tackled this issue by leveraging the already learned weights and training data for generating new weights through conducting a computationally expensive analysis step. In this paper, we introduce MixtureGrowth, a new approach to growing networks that circumvents the initialization overhead in prior work. Before growing, each layer in our model is generated with a linear combination of parameter templates. Newly grown layer weights are generated by using a new linear combination of existing templates for a layer. On one hand, these templates are already trained for the task, providing a strong initialization. On the other, the new coefficients provide flexibility for the added layer weights to learn something new. We show that our approach boosts top-1 accuracy over the state-of-the-art by 2-2.5% on CIFAR-100 and ImageNet datasets, while achieving comparable performance with fewer FLOPs to a larger network trained from scratch. Code is available at https://github.com/chaudatascience/mixturegrowth.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
R.I.P.
๐ป
Ghosted
Semi-Supervised Classification with Graph Convolutional Networks
R.I.P.
๐ป
Ghosted
Proximal Policy Optimization Algorithms
R.I.P.
๐ป
Ghosted