Paper Title
Self-Regularized Prototypical Network for Few-Shot Semantic Segmentation
Authors
Abstract
Deep CNNs for image semantic segmentation typically require a large number of densely annotated images for training and have difficulty generalizing to unseen object categories. Few-shot segmentation has therefore been developed to perform segmentation with just a few annotated examples. In this work, we tackle few-shot segmentation with a self-regularized prototypical network (SRPNet) based on prototype extraction, which makes better use of the support information. The proposed SRPNet extracts class-specific prototype representations from support images and generates segmentation masks for query images using a distance metric, the fidelity. SRPNet introduces a direct yet effective prototype regularization, in which the generated prototypes are evaluated and regularized on the support set itself. The extent to which the generated prototypes restore the support mask imposes an upper limit on performance: the performance on the query set should never exceed this limit, no matter how completely the knowledge generalizes from the support set to the query set. With this prototype regularization, SRPNet fully exploits the knowledge in the support set and yields high-quality prototypes that are representative of each semantic class while remaining discriminative across classes. Query performance is further improved by an iterative query inference (IQI) module that combines a set of regularized prototypes. The proposed SRPNet achieves new state-of-the-art performance on 1-shot and 5-shot segmentation benchmarks.
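The prototype pipeline described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation, and it rests on several assumptions: prototypes are obtained by masked average pooling (a common choice in prototypical segmentation networks, not confirmed by the abstract), cosine similarity stands in for the paper's fidelity metric (which the abstract does not define), and the support-set IoU is only a rough proxy for the idea of evaluating prototypes on the support set itself. All function names and the toy feature maps are hypothetical.

```python
import numpy as np

def masked_average_prototype(features, mask):
    """Average the feature vectors of the pixels selected by a binary mask.

    features: (H, W, C) feature map; mask: (H, W) array of 0/1.
    Assumption: masked average pooling is used for prototype extraction.
    """
    weights = mask.astype(features.dtype)[..., None]  # (H, W, 1)
    return (features * weights).sum(axis=(0, 1)) / max(weights.sum(), 1e-8)

def cosine_scores(features, prototypes):
    """Cosine similarity between every pixel feature and every prototype.

    Stand-in for the paper's fidelity metric, which is not specified here.
    features: (H, W, C); prototypes: (K, C); returns (H, W, K).
    """
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + 1e-8)
    return f @ p.T

def segment(features, prototypes):
    """Label each pixel with the index of its most similar prototype."""
    return cosine_scores(features, prototypes).argmax(axis=-1)

def foreground_iou(pred, target):
    """Intersection over union of the foreground (label 1) regions."""
    inter = np.logical_and(pred == 1, target == 1).sum()
    union = np.logical_or(pred == 1, target == 1).sum()
    return inter / union if union else 1.0

def toy_features(mask):
    """Hypothetical 2-channel features: foreground and background differ."""
    fg = np.array([1.0, 0.1])
    bg = np.array([0.1, 1.0])
    return np.where(mask[..., None] == 1, fg, bg)

# Support image: foreground occupies the left half.
support_mask = np.zeros((4, 4), dtype=int)
support_mask[:, :2] = 1
support_feat = toy_features(support_mask)

# One prototype per class: index 0 = background, index 1 = foreground.
prototypes = np.stack([
    masked_average_prototype(support_feat, 1 - support_mask),
    masked_average_prototype(support_feat, support_mask),
])

# Check how well the prototypes restore the support mask itself.
support_pred = segment(support_feat, prototypes)
support_iou = foreground_iou(support_pred, support_mask)

# Query image: foreground occupies the top half.
query_mask = np.zeros((4, 4), dtype=int)
query_mask[:2, :] = 1
query_feat = toy_features(query_mask)
query_pred = segment(query_feat, prototypes)
query_iou = foreground_iou(query_pred, query_mask)

print("support IoU:", support_iou, "query IoU:", query_iou)
```

In this toy setting the features are perfectly separable, so both IoU values are ideal. With real features, the support IoU computed this way would indicate how faithfully the prototypes summarize the support annotation, which is the quantity the abstract argues bounds query performance.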