AdapterDrop: On the Efficiency of Adapters in Transformers

October 22, 2020 Β· Declared Dead Β· πŸ› Conference on Empirical Methods in Natural Language Processing

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Andreas RΓΌcklΓ©, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych arXiv ID 2010.11918 Category cs.LG: Machine Learning Cross-listed cs.CL Citations 301 Venue Conference on Empirical Methods in Natural Language Processing Last Checked 3 months ago
Abstract
Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements. Recent approaches tackle these shortcomings by training smaller models, dynamically reducing the model size, and by training light-weight adapters. In this paper, we propose AdapterDrop, removing adapters from lower transformer layers during training and inference, which incorporates concepts from all three directions. We show that AdapterDrop can dynamically reduce the computational overhead when performing inference over multiple tasks simultaneously, with minimal decrease in task performances. We further prune adapters from AdapterFusion, which improves the inference efficiency while maintaining the task performances entirely.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Machine Learning

Died the same way β€” πŸ‘» Ghosted