Accepted Paper: Few-Shot Learning with Adaptively Initialized Task Optimizer

Session 2: Multi-task Learning, NLP, Computer Vision, Applications -- Day 2 (Nov. 18), talks: 09:00-11:00 (5th floor Hall 2), poster session: 11:00-13:30
Poster number: Mon23

Authors

Han-Jia Ye (Nanjing University); Xiang-Rong Sheng (National Key Laboratory of Novel Software Technology, Nanjing University); De-Chuan Zhan (Nanjing University)

Abstract

Because data collection and labeling are costly in real-world applications, training a model from limited examples is an essential problem in machine learning, visual recognition, and related fields. Directly training a model on such few-shot learning (FSL) tasks tends to over-fit; in this setting, an effective task-level inductive bias serves as the key supervision. By treating each task as a whole, extracting task-level patterns, and learning a common model initialization, the Model-Agnostic Meta-Learning (MAML) approach enables various models to be applied to FSL tasks. Given a training set with a few examples, MAML optimizes a model with a fixed number of gradient descent steps from an initial point chosen beforehand. Although it is general and gives empirically satisfactory results, the model-agnostic initialization neglects task-specific characteristics and also increases the computational burden. In this manuscript, we propose the Adaptively Initialized Task OptimizeR (Aviator) approach for few-shot learning, which incorporates task context into the determination of the model initialization. This task-specific initialization substantially facilitates model optimization, in terms of both solution quality and efficiency. To this end, we decouple the model and apply a set transformation over the training set to determine the initial top-layer classifier. Re-parameterization of the first-order gradient descent approximation promotes gradient back-propagation. Experiments on synthetic and benchmark data sets validate that Aviator achieves state-of-the-art performance, and visualization results demonstrate the task-adaptive features of the proposed method.
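
The contrast between a fixed, task-agnostic initialization and a task-specific one can be illustrated with a minimal NumPy sketch. This is not the Aviator implementation: the abstract does not specify the set transformation, so the sketch assumes a simple stand-in (per-class feature means, which are invariant to the order of the training examples) and a linear softmax classifier as the top layer. All function names (`set_based_init`, `adapt`, `loss_and_grad`) and the default `steps` and `lr` values are hypothetical.

```python
import numpy as np


def softmax(z):
    # Row-wise softmax, shifted for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def loss_and_grad(W, X, y, n_classes):
    # Mean cross-entropy of a linear softmax classifier and its gradient w.r.t. W.
    P = softmax(X @ W)
    Y = np.eye(n_classes)[y]
    loss = -np.mean(np.log(P[np.arange(len(y)), y] + 1e-12))
    grad = X.T @ (P - Y) / len(y)
    return loss, grad


def adapt(W0, X, y, n_classes, steps=5, lr=0.1):
    # Inner loop in the style described for MAML: a fixed number of
    # gradient descent steps starting from a given initial point W0.
    W = W0.copy()
    for _ in range(steps):
        _, g = loss_and_grad(W, X, y, n_classes)
        W = W - lr * g
    return W


def set_based_init(X, y, n_classes):
    # ASSUMPTION: stand-in for a "set transformation" over the training set.
    # Each column of the initial classifier is the mean feature vector of one
    # class, so the result does not depend on the order of the examples.
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)], axis=1)
```

Calling `adapt(set_based_init(X, y, n_classes), X, y, n_classes)` on a few-shot training set starts the same fixed-step optimization from a point computed from that task's own examples, whereas passing a shared `W0` (for example, zeros) corresponds to a task-agnostic starting point.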