Shake-Shake regularization

May 21, 2017 · Entered Twilight · 🏛 arXiv.org

"Last commit was 7.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: LICENSE, README.md, checkpoints.lua, main.lua, models, opts.lua, train.lua

Authors: Xavier Gastaldi
arXiv ID: 1705.07485
Category: cs.LG: Machine Learning
Cross-listed: cs.CV
Citations: 393
Venue: arXiv.org
Repository: https://github.com/xgastaldi/shake-shake (⭐ 297)
Last Checked: 1 month ago

Abstract

The method introduced in this paper aims at helping deep learning practitioners faced with an overfit problem. The idea is to replace, in a multi-branch network, the standard summation of parallel branches with a stochastic affine combination. Applied to 3-branch residual networks, shake-shake regularization improves on the best single shot published results on CIFAR-10 and CIFAR-100 by reaching test errors of 2.86% and 15.85%. Experiments on architectures without skip connections or Batch Normalization show encouraging results and open the door to a large set of applications. Code is available at https://github.com/xgastaldi/shake-shake
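To make the "stochastic affine combination" concrete, here is a minimal sketch in plain Python. It is not taken from the linked Lua repository. The function name `shake_combine`, the list-of-floats representation, and the `rng` parameter are all hypothetical, chosen only for illustration. The sketch assumes a skip connection plus two residual branches, a coefficient alpha drawn uniformly from [0, 1] during training, and the expected value 0.5 at evaluation time. It covers the forward pass only.

```python
import random

def shake_combine(x, branch1, branch2, training=True, rng=random):
    """Combine a skip input with two branch outputs (lists of floats).

    Hypothetical helper: instead of summing the branches, weight them
    by alpha and (1 - alpha), where alpha is random during training.
    """
    # Training: draw alpha uniformly from [0, 1).
    # Evaluation: use the expected value 0.5 (assumption stated in the lead-in).
    alpha = rng.random() if training else 0.5
    return [xi + alpha * b1 + (1.0 - alpha) * b2
            for xi, b1, b2 in zip(x, branch1, branch2)]

# Toy values standing in for real activations.
x = [1.0, 2.0]
b1 = [0.5, -1.0]
b2 = [1.5, 3.0]

eval_out = shake_combine(x, b1, b2, training=False)
train_out = shake_combine(x, b1, b2, training=True, rng=random.Random(0))
```

Because alpha and (1 - alpha) sum to one, each training output lies between `x + branch1` and `x + branch2` elementwise, and the evaluation output is their midpoint.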