Paper Title

Hypernetwork approach to Bayesian MAML

Paper Authors

Piotr Borycki, Piotr Kubacki, Marcin Przewięźlikowski, Tomasz Kuśmierczyk, Jacek Tabor, Przemysław Spurek

Abstract

The main goal of Few-Shot learning algorithms is to enable learning from small amounts of data. One of the most popular and elegant Few-Shot learning approaches is Model-Agnostic Meta-Learning (MAML). The main idea behind this method is to learn the shared universal weights of a meta-model, which are then adapted for specific tasks. However, the method suffers from over-fitting and poorly quantifies uncertainty due to limited data size. Bayesian approaches could, in principle, alleviate these shortcomings by learning weight distributions in place of point-wise weights. Unfortunately, previous modifications of MAML are limited due to the simplicity of Gaussian posteriors, MAML-like gradient-based weight updates, or by the same structure enforced for universal and adapted weights. In this paper, we propose a novel framework for Bayesian MAML called BayesianHMAML, which employs Hypernetworks for weight updates. It learns the universal weights point-wise, but a probabilistic structure is added when adapted for specific tasks. In such a framework, we can use simple Gaussian distributions or more complicated posteriors induced by Continuous Normalizing Flows.
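To make the adaptation mechanism described above concrete, here is a minimal numpy sketch of the Gaussian-posterior variant: universal weights are learned point-wise, while a hypernetwork maps an aggregated support-set embedding to the mean and log-variance of a Gaussian over the task-specific weight update. All dimensions, variable names, and the aggregation scheme are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed, not from the paper)
feat_dim, n_classes, hidden = 8, 5, 16

# Universal (point-wise) meta-learned weights of the target model
theta = rng.normal(scale=0.1, size=(feat_dim, n_classes))

# Hypernetwork parameters: embed the support set, then emit the
# (mu, log_sigma) of a Gaussian posterior over the weight update
H1 = rng.normal(scale=0.1, size=(feat_dim, hidden))
H2_mu = rng.normal(scale=0.1, size=(hidden, feat_dim * n_classes))
H2_ls = rng.normal(scale=0.1, size=(hidden, feat_dim * n_classes))

def adapt(support_x):
    """Sample task-adapted weights: theta + mu + sigma * eps."""
    emb = np.tanh(support_x.mean(axis=0) @ H1)        # aggregate support set
    mu = (emb @ H2_mu).reshape(feat_dim, n_classes)
    log_sigma = (emb @ H2_ls).reshape(feat_dim, n_classes)
    eps = rng.normal(size=mu.shape)                    # reparameterization trick
    return theta + mu + np.exp(log_sigma) * eps

support_x = rng.normal(size=(5, feat_dim))             # a 5-shot support set
theta_task = adapt(support_x)
print(theta_task.shape)  # (8, 5)
```

The point of the sketch is the split the abstract describes: `theta` stays deterministic and shared across tasks, and only the per-task update is probabilistic. Replacing the Gaussian sample with a sample pushed through a Continuous Normalizing Flow would yield the more flexible posterior the paper mentions.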
