Paper title
Fuzzy Simplicial Networks: A Topology-Inspired Model to Improve Task Generalization in Few-shot Learning
Paper authors
Paper abstract
Deep learning has shown great success in settings with massive amounts of data but has struggled when data is limited. Few-shot learning algorithms, which seek to address this limitation, are designed to generalize well to new tasks with limited data. Typically, models are evaluated on unseen classes and datasets that are defined by the same fundamental task they were trained on (e.g., category membership). One can also ask how well a model generalizes to fundamentally different tasks within a fixed dataset (for example, moving from category membership to tasks that involve detecting object orientation or quantity). To formalize this kind of shift, we define a notion of "independence of tasks" and identify three new sets of labels for established computer vision datasets that test a model's ability to generalize to tasks that draw on orthogonal attributes in the data. We use these datasets to investigate the failure modes of metric-based few-shot models. Based on our findings, we introduce a new few-shot model called Fuzzy Simplicial Networks (FSN), which leverages a construction from topology to represent each class more flexibly from limited data. In particular, FSN models can not only form multiple representations for a given class but can also begin to capture the low-dimensional structure that characterizes class manifolds in the encoded space of deep networks. We show that FSN outperforms state-of-the-art models on the challenging tasks we introduce in this paper while remaining competitive on standard few-shot benchmarks.