Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks
December 31, 2019 ยท Entered Twilight ยท ๐ arXiv.org
"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"
Evidence collected by the PWNC Scanner
Repo contents: .gitignore, Dockerfile, LICENSE, README.md, __init__.py, assets, configs, evkit, feature_selector, requirements.txt, scripts, tlkit, tnt
Authors
Jeffrey O Zhang, Alexander Sax, Amir Zamir, Leonidas Guibas, Jitendra Malik
arXiv ID
1912.13503
Category
cs.LG: Machine Learning
Cross-listed
cs.CV,
cs.NE,
cs.RO
Citations
29
Venue
arXiv.org
Repository
https://github.com/jozhang97/side-tuning
โญ 72
Last Checked
1 month ago
Abstract
When training a neural network for a desired task, one may prefer to adapt a pre-trained network rather than starting from randomly initialized weights. Adaptation can be useful in cases when training data is scarce, when a single learner needs to perform multiple tasks, or when one wishes to encode priors in the network. The most commonly employed approaches for network adaptation are fine-tuning and using the pre-trained network as a fixed feature extractor, among others. In this paper, we propose a straightforward alternative: side-tuning. Side-tuning adapts a pre-trained network by training a lightweight "side" network that is fused with the (unchanged) pre-trained network via summation. This simple method works as well as or better than existing solutions and it resolves some of the basic issues with fine-tuning, fixed features, and other common approaches. In particular, side-tuning is less prone to overfitting, is asymptotically consistent, and does not suffer from catastrophic forgetting in incremental learning. We demonstrate the performance of side-tuning under a diverse set of scenarios, including incremental learning (iCIFAR, iTaskonomy), reinforcement learning, imitation learning (visual navigation in Habitat), NLP question-answering (SQuAD v2), and single-task transfer learning (Taskonomy), with consistently promising results.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
R.I.P.
๐ป
Ghosted
Semi-Supervised Classification with Graph Convolutional Networks
R.I.P.
๐ป
Ghosted
Proximal Policy Optimization Algorithms
R.I.P.
๐ป
Ghosted