R.I.P.
๐ป
Ghosted
Meta-learning the Learning Trends Shared Across Tasks
October 19, 2020 ยท Declared Dead ยท ๐ British Machine Vision Conference
Authors
Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah
arXiv ID
2010.09291
Category
cs.LG: Machine Learning
Cross-listed
cs.AI,
cs.CV
Citations
9
Venue
British Machine Vision Conference
Repository
https://github.com/brjathu/PAMELA
Last Checked
1 month ago
Abstract
Meta-learning stands for 'learning to learn' such that generalization to new tasks is achieved. Among these methods, Gradient-based meta-learning algorithms are a specific sub-class that excel at quick adaptation to new tasks with limited data. This demonstrates their ability to acquire transferable knowledge, a capability that is central to human learning. However, the existing meta-learning approaches only depend on the current task information during the adaptation, and do not share the meta-knowledge of how a similar task has been adapted before. To address this gap, we propose a 'Path-aware' model-agnostic meta-learning approach. Specifically, our approach not only learns a good initialization for adaptation, it also learns an optimal way to adapt these parameters to a set of task-specific parameters, with learnable update directions, learning rates and, most importantly, the way updates evolve over different time-steps. Compared to the existing meta-learning methods, our approach offers: (a) The ability to learn gradient-preconditioning at different time-steps of the inner-loop, thereby modeling the dynamic learning behavior shared across tasks, and (b) The capability of aggregating the learning context through the provision of direct gradient-skip connections from the old time-steps, thus avoiding overfitting and improving generalization. In essence, our approach not only learns a transferable initialization, but also models the optimal update directions, learning rates, and task-specific learning trends. Specifically, in terms of learning trends, our approach determines the way update directions shape up as the task-specific learning progresses and how the previous update history helps in the current update. Our approach is simple to implement and demonstrates faster convergence. We report significant performance improvements on a number of FSL datasets.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
R.I.P.
๐ป
Ghosted
Semi-Supervised Classification with Graph Convolutional Networks
R.I.P.
๐ป
Ghosted
Proximal Policy Optimization Algorithms
R.I.P.
๐ป
Ghosted
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Died the same way โ ๐ 404 Not Found
R.I.P.
๐
404 Not Found
Deep High-Resolution Representation Learning for Visual Recognition
R.I.P.
๐
404 Not Found
HuggingFace's Transformers: State-of-the-art Natural Language Processing
R.I.P.
๐
404 Not Found
CCNet: Criss-Cross Attention for Semantic Segmentation
R.I.P.
๐
404 Not Found